Abstract

ML-style modules are valuable in the development and maintenance
of large software systems; unfortunately, none of the existing
languages support them in a fully satisfactory manner. The official
SML'97 Definition does not allow higher-order functors, so a
module that refers to externally defined functors cannot accurately
describe its import interface. MacQueen and Tofte [26] extended
SML'97 with fully transparent higher-order functors, but their
system does not have a type-theoretic semantics and thus fails to
support fully syntactic signatures. The systems of manifest types
[19, 20] and translucent sums [12] support fully syntactic signatures,
but they may propagate fewer type equalities than fully transparent
functors. This paper presents a module calculus that supports both
fully transparent higher-order functors and fully syntactic signatures
(and thus true separate compilation). We give a simple type-theoretic
semantics to our calculus and show how to compile it into
an Fω-like λ-calculus extended with existential types.

1  Introduction 

Modular programming is one of the most commonly used techniques
in the development and maintenance of large software systems.
Using modularization, we can decompose a large software
project into smaller pieces (modules) and then develop and understand
each of them in isolation. The key ingredients in modularization
are the explicit interfaces used to model inter-module dependencies.
Good interfaces not only make separate compilation
type-safe but also allow us to think about large systems without
holding the whole system in our head at once. A powerful module
language must support equally expressive interface specifications
in order to achieve the optimal results.

1.1  Why  higher-order  functors? 

Standard ML [27, 28] provides a powerful module system. The
main innovation of the ML module language is its support of
parameterized modules, also known as functors. Unlike Modula-3
generics [31] or C++ templates [37], ML functors can be type-checked
and compiled independently at their definition site; furthermore,
different applications of the same functor can share a single
copy of the implementation (i.e., object code), even though each
application may produce modules with different interfaces.

Functors have proven to be valuable in the modeling and organization
of extensible systems [1, 10, 6, 32]. The Fox project at
CMU [1] uses ML functors to represent the TCP/IP protocol layers;
through functor applications, different protocol layers can be mixed
and matched to generate new protocol stacks with application-specific
requirements. Also, a standard C++ template library written
using ML functors would not require nasty cascading recompilations
when the library is updated, simply because ML functors
can be compiled separately before even being applied.

Unfortunately, any use of functors and nested modules also implies
that the underlying module language must support higher-order
functors (i.e., functors passed as arguments or returned as
results by other functors), because otherwise there is no way to
accurately specify the import signature of a module that refers to
externally defined functors. For example, if we decompose the
following ML program into two smaller pieces, one for FOO and
another for BAR:

functor FOO (A : SIG) = ...

structure BAR = struct structure B = ...
                       structure C = FOO(B)
                end

the fragment for BAR must treat FOO as its import argument. This
essentially turns BAR into a higher-order functor since it must take
another functor as its argument. Without higher-order functors, we
cannot fully specify the interfaces¹ of arbitrary ML programs. The
lack of fully syntactic (i.e., explicit) signatures also violates the
fundamental principles of modularization and makes it impossible
to support Modula-2 style true separate compilation [19].

1.2  Main  challenges 

Supporting higher-order functors with fully syntactic signatures
turns out to be a very hard problem. Standard ML (SML) [28]
only supports first-order functors. MacQueen and Tofte [26, 38]
extended SML with fully transparent higher-order functors, but their
scheme does not provide fully syntactic signatures. Independently,
Harper and Lillibridge [12] and Leroy [19] proposed to use translucent
sums and manifest types to model type sharing; their scheme
supports fully syntactic signatures but fails to propagate as much
sharing as in the MacQueen-Tofte system. Leroy [20] proposed to
use applicative semantics to model full transparency, but his signature
calculus is not fully syntactic since it only handles limited
forms of functor expressions; this limitation was lifted in Courant's
recent proposal [7], but only at the expense of putting arbitrary
module implementation code into the interfaces, which in turn
compromises the very benefits of modularization and makes interface
checking much harder.

¹ We only need to write the signatures for first-order functors if we
use a special “compilation unit” construct with import and export
statements, but reasoning about such a construct would likely require
a formalism similar to the one needed for higher-order modules.

The  main  challenge  is  thus  to  design  a  module  language  that 
satisfies  all  of  the  following  properties: 

•  It must have fully syntactic signatures: if we split a program
at an arbitrary point, the corresponding interface must be
expressible using the underlying signature calculus.

•  It must have simple type-theoretical semantics: a clean
semantics makes formal reasoning easier; it is also a prerequisite
for a simple signature calculus.

•  It should support fully transparent higher-order functors:
higher-order functors should be a natural extension of first-order
ones; simple ML functors can propagate type sharing
from the argument to the result; higher-order functors should
propagate sharing in the same way.

•  It should support opaque types and signatures: type abstraction
is the standard method of hiding implementation details
from the clients of a module; the same mechanism should be
applicable to higher-order functors as well.

•  It should support efficient elaboration and implementation:
a module system will not be practical if it cannot be typechecked
and compiled efficiently; compilation of module
programs should also be compatible with the standard type-directed
compilation techniques [18, 15, 35, 36].

1.3  Our contributions

This paper presents a higher-order module calculus that satisfies all
of the above properties. We show that fully transparent higher-order
functors can also have a simple type-theoretic semantics, so
they can be added into ML-like languages while still supporting
true separate compilation. Our key idea is to adapt and incorporate
the phase-splitting interpretation of higher-order modules [14, 36]
into a surface module calculus; the result is a new
method that propagates more sharing information (across functor
application) than the system based on translucent sums [12] and
manifest types [19]. More specifically, given a signature or a functor
signature S, we extract all the flexible components in S into a
single higher-order “type-constructor” variable u; here, by flexible,
we mean those undefined type or module components inside S. We
call u the flexroot constructor of signature S. We use K to
denote the kind of u and S' to denote the instantiation of S whose
flexible components are redirected to the corresponding entries in
u. An opaque view of signature S can be modeled as an existential
type ∃u : K. S'. A transparent view of S can be obtained by
substituting the flexroot of S with the actual constructor information.
Full transparency is then achieved by propagating the flexroot
information through functor application.

Our new phase-splitting interpretation also leads to a simpler
type theory for the system based on translucent sums and manifest
types. Recent work on phase-splitting transformation [14, 36, 8]
has shown that ML-like module languages are better understood
by translating them into an Fω-like polymorphic λ-calculus. These
translations, however, do not support opaque modules very well
because abstract types must be made concrete during the translation.
The translation of translucent sums is even more problematic:
Crary et al. [8] have to extend Fω with singleton and dependent
kinds to capture the sharing information in the surface language.
The translation based on our new interpretation rightly
turns opaque modules and abstract types into simple existential
types. Furthermore, it does not need to use singleton and dependent
kinds. This is significant because typechecking singleton and
dependent kinds is notoriously difficult [8].

In the rest of this paper, we first use a series of examples to
informally explain the main ideas. We then present our new Extended
Module Calculus (EMC), which supports both fully transparent
higher-order functors and fully syntactic signatures. We
demonstrate the expressiveness of EMC by translating a version of
the Abstract Module Calculus (AMC) and a version of the Transparent
Module Calculus (TMC) into the EMC calculus. Finally, to
support type-directed compilation [18, 15, 35, 36], we show how
EMC can be translated into a Kernel Module Calculus (KMC) and
then to an Fω-like Target Calculus (FTC). The relationship among
these five calculi is depicted in Figure 1.

Figure 1: Relationship among five different calculi (AMC, TMC, EMC, KMC, and FTC)

2  Informal Development

2.1  Fully transparent higher-order functors

We first use a series of examples to show how the MacQueen-Tofte
system [26] supports fully transparent higher-order functors. We
start by defining a signature SIG and a functor signature FSIG:


signature  SIG  =  sig  type  t  val  x  :  t  end 
funsig  FSIG  =  fsig  (X:  SIG) :  SIG 

MacQueen and Tofte use a strong sum Σ to express the module type,
so signature SIG is equivalent to the dependent sum type SIG =
Σt.t and signature FSIG is the same as the dependent product type
ΠX : SIG. SIG. We also define a structure S with signature SIG,
and two functors F1 and F2, both with signature FSIG:

structure S = struct type t=int val x=1 end

functor F1 (X: SIG) =
  struct type t=X.t val x=X.x end
functor F2 (X: SIG) =
  struct type t=int val x=1 end

Although SIG does not define the actual type for t, functor
applications such as F1(S) will always re-elaborate the body of F1
with X bound to S, so the type identity of X.t (which is int) is
faithfully propagated into the result F1(S). Now suppose we define
the following higher-order functor, which takes a functor F as
argument and applies it to the previously defined structure S:

functor  APPS  (F:  FSIG)  =  F (S) 

We can then apply APPS to functors F1 and F2:

structure R =
  struct structure R1 = APPS(F1)
         structure R2 = APPS(F2)
         val res = (R1.x = R2.x)
  end

In the MacQueen-Tofte system, both APPS(F1) and APPS(F2)
will re-elaborate the body of APPS, which in turn re-elaborates the
functor body in F1 and F2; it successfully infers that R1.x and
R2.x both have type int, so the equality test (R1.x = R2.x)
will typecheck.
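
The first-order part of this behavior can be checked in any SML'97
implementation. The following is a minimal sketch that repeats the
declarations of SIG, S, F1, and F2 so that it is self-contained; the
names A1, A2, and same are hypothetical and introduced purely for
illustration. Because S is bound without an opaque ascription, the
identity S.t = int is visible, and both applications yield structures
whose t component is int:

```sml
signature SIG = sig type t val x : t end

structure S = struct type t = int val x = 1 end

functor F1 (X : SIG) = struct type t = X.t val x = X.x end
functor F2 (X : SIG) = struct type t = int val x = 1 end

(* First-order applications: the identity of t flows into the result. *)
structure A1 = F1 (S)
structure A2 = F2 (S)

(* Typechecks because A1.t and A2.t are both known to be int. *)
val same : bool = (A1.x = A2.x)
```

The higher-order functor APPS is deliberately left out of this sketch,
since the SML'97 Definition does not provide higher-order functors.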

MacQueen and Tofte [26] call functors such as APPS fully
transparent modules since they faithfully propagate all sharing
information in the actual argument (e.g., F1 and F2) into the result
(e.g., R1 and R2). Unfortunately, their scheme does not support
fully syntactic signatures. If we want to turn module R into a separate
compilation unit, we have no way to completely specify its
import interface. More specifically, we cannot write a signature for
APPS so that all sharing information in the argument is propagated
into the result. The closest we get is to assign APPS the signature:

funsig  BADSIG  = 

fsig  (F:  FSIG) :  sig  type  t=int  val  x  :  t  end 

But  this  would  not  work  if  R  also  contains  the  following  code: 

functor F3 (X: SIG) =
  struct type t=real val x=3.0 end

structure R3 = APPS(F3)

Signature BADSIG clearly does not capture the sharing information
propagated during the application of APPS(F3). The actual
implementation of the MacQueen-Tofte system [36] memoizes a
“skeleton” for each functor body to support re-elaboration, but this
is clearly too complex to be used in a surface signature calculus.

2.2  Translucent  sums  and  manifest  types 

A more severe problem of the MacQueen-Tofte system is that it
lacks a clean type-theoretic semantics: its typechecker must use an
operational stamp generator to model abstract types; this makes it
impossible to express the typing property in the surface signature
calculus. In 1994, Harper and Lillibridge [12] and Leroy [19]
proposed (independently) to use translucent sums and manifest types
to model ML modules; the resulting framework, which we call
the abstract approach, has a clean type-theoretic equational theory
on types; furthermore, both systems support fully syntactic signatures.
Leroy [21] and Harper [16] have also shown that their systems
are sufficiently expressive to type the entire module
language in the official SML'97 Definition [28].

Unfortunately, in the case of higher-order functors, the abstract
approach does not propagate as much sharing as one would normally
expect in the MacQueen-Tofte system. For example, the
previous equality test (R1.x = R2.x) would not typecheck in
Harper and Leroy's systems [12, 19]. In fact, the abstract approach
treats the signature SIG as an existential type SIG = ∃t.t and
the signature FSIG as a dependent product ΠX : SIG. SIG. The
functor APPS is assigned the following signature type:

ΠF : (ΠX : SIG. SIG). SIG

Applying APPS to F1 or F2 always yields a new existential package
∃t.t, so R1.t and R2.t are two distinct abstract types.


The abstract approach relies on signature subsumption and
strengthening [12, 19] to propagate sharing information from the
functor argument to the result. But the subsumption rules are not
powerful enough to support fully transparent higher-order functors.
Nevertheless, the abstract approach does have fully syntactic signatures,
and having a functor parameter return an abstract result is
sometimes useful. Taking the functor APPS as an example, sometimes
we indeed want the parameter F to be a functor that always
generates new types at each application.

2.3  Transparent  modules  with  syntactic  signatures 

We would like to extend the abstract approach to support fully
transparent higher-order functors. Our key idea is to adapt and
incorporate the phase-splitting interpretation of higher-order modules
[14, 36] into a surface module calculus; the result is a new
method that propagates more sharing information (across functor
application) than the system based on translucent sums and manifest
types. Given a signature or a functor signature S, we extract
all the flexible components in S into a single higher-order
“type-constructor” variable u; here, by flexible, we mean those undefined
type or module components inside S. We call u the flexroot
constructor of signature S. We use K to denote the kind of u and
S' to denote the instantiation of S with all of its flexible components
referring to the corresponding entries in u. An opaque view
of signature S can be modeled as an existential type ∃u : K. S';
a transparent view of S can be obtained by substituting all occurrences
of u in S' by an actual flexroot constructor. For example,
the previous signature declaration:

signature  SIG  =  sig  type  t  val  x  :  t  end 

can  be  viewed  as  a  template  of  form: 

SIG = λu : K_SIG. (sig type t = #t(u) val x : t end)

where kind K_SIG is equal to {t : Ω}. We use #t(u) to denote the
t component from a constructor record u; this is to emphasize its
difference from the module access paths in ML (e.g., X.t).

Instantiating the flexroot of SIG with constructor {t = int}
yields a signature of form:

sig  type  t  =  int  val  x  :  t  end 

Meanwhile,  the  following  SML  code: 

structure X :> SIG =
  struct type t=int val x=1 end

creates an opaque view of SIG, so module X has type
∃u : K_SIG. (SIG[u]), or expanded to:

∃u : K_SIG. (sig type t = #t(u) val x : t end).

In the rest of this paper, we follow the abstract approach and treat
signature matching as opaque by default. Given a module identifier
X and a signature S, we say that X has signature S if X has type
∃u : K_S. (S[u]). The abstract flexroot constructor in X can be
retrieved using dot notation on existentials [5]; such notation is
usually written as X.typ, but in this paper we use a more concise
notation: we will use the overlined identifier X̄ to represent the
flexroot of X.

It is informative to compare flexroot with the notion of access
paths in the abstract approach [12, 19]. A type path X.t in the
system based on translucent sums and manifest types may denote
an abstract type (as in dot notation). Under the flexroot notation,
X.t always refers to an actual type definition (the t component
of module X), which in turn is defined as type #t(X̄); in other
words, all the flexible components in X are now redirected to the
abstract flexroot constructor X̄.

Combining all the flexible components into a single flexroot
constructor makes it easier to propagate sharing information
through functor application. For example, the earlier ML code:

functor F1 (X: SIG) =
  struct type t=X.t val x=X.x end

creates a functor with type²

ΠX : (∃u. SIG[u]). (sig type t = X.t val x : t end)

or written as signature:

fsig (X: SIG) : sig type t = X.t val x : t end

Here, the type path X.t in the result signature really refers to type
#t(X̄). During functor application, we create a transparent view
of the actual argument following signature SIG; we instantiate the
flexroot X̄ into an actual constructor and then propagate this
information into the result signature.

The idea gets more interesting in the higher-order case. Because
all functors are abstract under the abstract approach, we first
need to find a way to introduce transparent higher-order functors.
SML'97 uses the “:” and “:>” notation to distinguish between
transparent and opaque signature matching; we borrow the same
notation and use it to specify the abstract and transparent functors.
In the following example,

funsig  NFSIG  =  fsig  (X:  SIG) :>  SIG 
funsig  TFSIG  =  fsig  (X:  SIG) :  SIG 

Signature NFSIG represents an abstract functor that always creates
fresh new types at each application. Signature TFSIG represents a
fully transparent functor that always propagates the sharing
information from the actual argument (i.e., X) into the result.
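
SML'97 already exhibits the same distinction for first-order functors,
where “:” and “:>” annotate the functor result. The sketch below is an
illustration only: the functor names TF and NF and the structures T1
and N1 are hypothetical, and SIG and S are repeated from Section 2.1
so that the fragment is self-contained.

```sml
signature SIG = sig type t val x : t end
structure S = struct type t = int val x = 1 end

(* Transparent result: the identity t = A.t is propagated. *)
functor TF (A : SIG) : SIG =
  struct type t = A.t val x = A.x end

(* Opaque result: each application yields a fresh abstract t. *)
functor NF (A : SIG) :> SIG =
  struct type t = A.t val x = A.x end

structure T1 = TF (S)
structure N1 = NF (S)

(* Accepted: T1.t is known to be int. *)
val ok : bool = (T1.x = 1)

(* Rejected by the typechecker if uncommented,
   because N1.t is abstract and not known to be int:
   val bad = (N1.x = 1)
*)
```

NFSIG and TFSIG lift this choice to functor parameters, which is
where the SML'97 Definition stops and the calculus in this paper begins.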

The definition of NFSIG introduces a template of form:

NFSIG = λu₁ : K_NFSIG. ΠX : (∃u₂. SIG[u₂]). (∃u₃. SIG[u₃])

where kind K_NFSIG is just K_SIG → {}. Notice that NFSIG does not
propagate any sharing information (u₁) into the result signature;
instead, each functor application always returns an existential
package. For example, the abstract version of functor APPS:

functor APPS (F: NFSIG) = F(S)

is assigned the following interface type:

ΠF : (∃u₁. NFSIG[u₁]). (∃u₂. SIG[u₂])

or written as signature:

fsig (F: NFSIG) :> SIG

On the other hand, the definition of TFSIG introduces a template
of form:

TFSIG = λu₁ : K_TFSIG. ΠX : (∃u₂. SIG[u₂]). (SIG[u₁[X̄]])

where kind K_TFSIG is equal to K_SIG → K_SIG (the algorithm
calculating such kind is given later in Section 3.2). The flexroot of
TFSIG has a different kind from that of NFSIG because functors
with signature TFSIG propagate more sharing information (e.g.,
constructor of kind K_TFSIG) than those with signature NFSIG.
Notice how functor application propagates sharing into the return
result: the flexroot of the result is u₁[X̄], where u₁ is the flexroot of
the functor itself and X̄ is the flexroot of the actual argument.

² We omitted the kind annotation for u to simplify the presentation;
we will do the same in the rest of the paper if the kind is clear from
the context.

We  can  now  write  the  fully  transparent  version  of  APPS  as: 

functor  APPS  (F:  TFSIG)  =  F(S) 

and we can assign it the following interface type:

ΠF : (∃u₁ : K_TFSIG. TFSIG[u₁]). (SIG[F̄[{t = S.t}]])

or if we write it in an extended signature calculus:

fsig (F: TFSIG): sig type t = #t(F[{t=S.t}])
                     val x : t
                 end

With proper syntactic hacks, this signature can even be written as:

fsig (F: TFSIG): sig type t = #t(F(S))
                     val x : t
                 end

as long as we assume that all module identifiers (e.g., F and S)
referenced inside the constructor context #t(·) always refer
to their constructor counterparts.

Getting back to the earlier example in Section 2.1 where we
apply APPS to functors F1 and F2, we see why both R1.t and
R2.t are now equivalent to int. To apply APPS to F1 (or F2),
we match F1 (or F2) against TFSIG and calculate the flexroot F̄
of the actual argument; F̄ is equal to λu.{t = #t(u)} for F1 or
λu.{t = int} for F2; in both cases, the t component of the result
is #t(F̄[{t = int}]), which ends up as int.
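
The constructor-level computation in this example can be replayed
mechanically. The following sketch models a tiny fragment of the
constructor language (records, projection, abstraction, and
application) as an SML datatype and normalizes #t(F̄[{t = int}]) for
both flexroots. The datatype con, the functions subst, norm, and
resultT, and the values f1, f2, and s are hypothetical names
introduced for illustration; they are not part of the calculus as
formalized in Section 3, and the naive substitution is only adequate
here because the substituted arguments are closed.

```sml
datatype con
  = Int
  | Real
  | Var of string
  | Rcd of (string * con) list    (* {l1 = c1, ...} *)
  | Proj of string * con          (* #l(c) *)
  | Lam of string * con           (* abstraction over a constructor variable *)
  | App of con * con              (* c1[c2] *)

(* Replace variable x by the closed constructor c inside body. *)
fun subst (x, c) body =
  case body of
      Var y => if x = y then c else body
    | Rcd fs => Rcd (map (fn (l, d) => (l, subst (x, c) d)) fs)
    | Proj (l, d) => Proj (l, subst (x, c) d)
    | Lam (y, d) => if x = y then body else Lam (y, subst (x, c) d)
    | App (d1, d2) => App (subst (x, c) d1, subst (x, c) d2)
    | _ => body

(* Normalize by beta-reducing applications and resolving projections. *)
fun norm c =
  case c of
      Rcd fs => Rcd (map (fn (l, d) => (l, norm d)) fs)
    | Proj (l, d) =>
        (case norm d of
             Rcd fs =>
               (case List.find (fn (l', _) => l' = l) fs of
                    SOME (_, d') => d'
                  | NONE => Proj (l, Rcd fs))
           | d' => Proj (l, d'))
    | App (f, a) =>
        (case norm f of
             Lam (x, b) => norm (subst (x, norm a) b)
           | f' => App (f', norm a))
    | Lam (x, b) => Lam (x, norm b)
    | _ => c

(* Flexroots of F1 and F2, and the flexroot {t = int} of S. *)
val f1 = Lam ("u", Rcd [("t", Proj ("t", Var "u"))])
val f2 = Lam ("u", Rcd [("t", Int)])
val s  = Rcd [("t", Int)]

(* The t component of the result of applying a functor flexroot to s. *)
fun resultT f = norm (Proj ("t", App (f, s)))
```

Under these assumptions, resultT f1 and resultT f2 both normalize to
Int, matching the claim that R1.t and R2.t are equivalent to int.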

2.4  Relationship  with  Leroy’s  applicative  functors 

Our syntactic signature looks similar to Leroy's applicative-functor
approach [20] where he can also assign APPS with a signature:

fsig (F: FSIG): sig type t=F(S).t
                    val x: t
                end

This similarity, however, stays only at the surface; the underlying
interpretations of the two are completely different. Under the
applicative approach, a functor with signature FSIG will always
generate the same abstract type if applied to the same argument.
Under our scheme, an abstract functor (with signature NFSIG) always
generates a new type at each application while a transparent functor
(with signature TFSIG) does not. We can simulate applicative
functors by opaquely matching a functor against a transparent functor
signature. For example,

functor F3 :> TFSIG = F1

Functor F3 would have type:

∃u₁ : K_TFSIG. ΠX : (∃u₂. SIG[u₂]). (SIG[u₁[X̄]])

Module expression and declaration:

  path  p ::= xᵢ | p.x
  mexp  m ::= p | str d₁, ..., dₙ end
            | fct(xᵢ : S)m | p₁(p₂)
            | (p :> S) | let d in m
  mdec  d ::= xᵢ = m | tᵢ = τ | vᵢ = e

Module signature and specification:

  sig   S ::= sig H₁, ..., Hₙ end
            | fsig(xᵢ : S) :> S'
  spec  H ::= xᵢ : S | tᵢ | tᵢ = τ | vᵢ : τ

Core language:

  ctyp  τ ::= tᵢ | p.t | ...
  cexp  e ::= vᵢ | p.v | ...

Elaboration context:

  ctxt  Γ ::= ε | Γ; H

Figure 2: Syntax of the abstract module calculus AMC


  ctxt formation      ⊢ Γ ok
  ctyp formation      Γ ⊢ τ
  cexp formation      Γ ⊢ e : τ
  spec formation      Γ ⊢ H
  sig formation       Γ ⊢ S
  mexp formation      Γ ⊢ m : S
  mdec formation      Γ ⊢ d : H
  ctyp equivalence    Γ ⊢ τ ≡ τ'
  spec subsumption    Γ ⊢ H ≤ H'
  sig subsumption     Γ ⊢ S ≤ S'

Figure 3: Static semantics for AMC: a summary


  Hⱼ/p ⇒ H'ⱼ   for j = 1, ..., n
  ----------------------------------------------
  (sig H₁, ..., Hₙ end)/p ⇒ (sig H'₁, ..., H'ₙ end)

  (fsig(xᵢ : S) :> S')/p ⇒ (fsig(xᵢ : S) :> S')

  S/p.x ⇒ S'
  --------------------------
  (xᵢ : S)/p ⇒ (xᵢ : S')

  (vᵢ : τ)/p ⇒ (vᵢ : τ)

  (tᵢ = τ)/p ⇒ (tᵢ = τ)        (tᵢ)/p ⇒ (tᵢ = p.t)

Figure 4: Signature strengthening in AMC


Because F3 is abstracted over its flexroot information, applying F3
to equivalent constructors will still result in equivalent types (e.g.,
#t(F3[{t = int}])).

One problem of the applicative approach is that it solely relies
on access paths to propagate sharing. Because access paths are not
allowed to contain arbitrary module expressions (doing otherwise
may break abstraction), the applicative approach cannot give an
accurate signature to the following functor:



functor PAPP (F : FSIG) (X : SIG) =
  let structure Y =
        struct type t = X.t * X.t
               val x = (X.x, X.x)
        end
  in F(Y)
  end

Leroy [20] did propose to type PAPP by “lambda-lifting” module
Y out of PAPP, but this dramatically alters the program structure,
making the module language impractical to program with.

Our approach uses the flexroot constructor to propagate sharing.
We can easily give PAPP an accurate signature:

fsig (F : TFSIG) (X : SIG) :
  sig type t = #t(F({t = #t(X) * #t(X)}))
      val x : t
  end

Notice we use TFSIG rather than NFSIG to emphasize that F is a
transparent functor.

3  Formalization

In this section we present an Extended Module Calculus (EMC)
that supports both fully transparent higher-order functors and fully
syntactic signatures. EMC is an extension of Leroy and Harper's
abstract module calculus [19, 21, 12] but with support for fully
transparent functors. To make the presentation easier to follow, we
first define an abstract module calculus that reviews the main ideas
behind translucent sums and manifest types. We then present our
new EMC calculus and show how it propagates more sharing than
the abstract approach.


3.1  The  abstract  module  calculus  AMC 

We use the Abstract Module Calculus (AMC) [19] as a representative
of the system based on translucent sums [12] and manifest
types [19, 21]. The syntax of AMC is given in Figure 2. The static
semantics for AMC is summarized in Figure 3. The complete typing
rules are given in Figures 4 to 6 and in Appendix A.

AMC is a typical ML-style module calculus containing constructs
such as module expressions (mexp), module declarations
(mdec), module access paths (path), signatures (sig), specifications
(spec), core-language types (ctyp) and expressions (cexp). Following
Leroy [21], we use xᵢ, tᵢ, and vᵢ to denote module, type, and
value identifiers, and x, t, and v for module, type, and value labels.
We assume that each declaration or specification in AMC
simultaneously defines an internal name (e.g., xᵢ, tᵢ, vᵢ) and an external
label (e.g., x, t, v). Given a structure str d₁, ..., dₙ end or a
signature sig H₁, ..., Hₙ end, declarations and specifications defined
later can refer to those defined earlier using the internal names.
However, to access the module components from outside, we must
use access paths such as p.x, p.v, and p.t, where p is another
path and x, v, and t are external labels.

Signatures  are  used  to  type  module  expressions.  An  AMC  sig¬ 
nature  can  be  either  a  functor  signature  or  a  regular  signature  that 
contains  an  ordered  list  of  module,  type,  and  value  specifications.  A 
functor  signature  is  written  as  f  sig(xj :  S)  :>  S'  where  S  denotes 
the  argument  signature  and  S'  the  result  signature.  We  borrow  the 
SML'97  notation  “ :  ”  and  “ :  >”  for  signature  matching  and  use  it  to 
specify  the  abstract  and  transparent  functors.  Because  AMC  only 
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i-  r  ok  Xi-.s  e  r 

r  b  Xi  :  S/xi 


r  I-  p  :  sig  Hi. .  .Hk,x'i :  S', . . .  end  p  —  {ti  hA  p.t,  Xi  hA  p.x  \  ti,  Xi  €  Dom(iTi. .  .Hk)} 

r  b  p.x'  :  p(S') 


_ r;  Xi :  5  I-  to  :  S' _  F  F  pi  -.  fsiqjxi:  S):>  S'  F\ ~  p2  ■  S"  F  b  S"  <  S 

T  I-  fct(xi:  S)m  :  fsig(xi:  S):>  S'  F  F  pi(p2) '■  {xi  p2}(S') 

_ b  F  ok _  T-Hi-...-Hk-i  F  dk:  Hk  k  =  l,...,n  F  F  p  :  S'  F  F  S'  <  S 

F  b  str  end  :  sig  end  F  b  str  di,...,dn  end  :  sig  Hi,,..,H„  end  F  b  (p  :>  S)  :  S 

r  b  S  r  b  d-.H  F;  H  b  m  :  S  F  b  to  :  5  F  b  r  F  b  e  :  r 

T  b  let  din  r«  :  S  F  b  (xi  —  to)  :  (xi :  S)  F  b  (ti  —  r)  :  ( U  —  t )  F  b  (vi  —  e)  :  ( Vi'.r ) 

Figure  5:  Selected  typing  rules  for  AMC:  T  b  m  :  S  and  r  b  d  :  H 


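Rules for reflexivity and transitivity are omitted:

$$
\begin{array}{c}
\dfrac{X = H_2, \ldots, H_n \ \text{and}\ X' = H'_2, \ldots, H'_m \qquad
       \Gamma \vdash H_1 \le H'_1 \qquad
       \Gamma; H_1 \vdash \mathtt{sig}\ X\ \mathtt{end} \le \mathtt{sig}\ X'\ \mathtt{end}}
      {\Gamma \vdash \mathtt{sig}\ H_1, X\ \mathtt{end} \le \mathtt{sig}\ H'_1, X'\ \mathtt{end}}
\\[2ex]
\dfrac{X = H_1, \ldots, H_n \ \text{and}\ X' = H'_1, \ldots, H'_m \qquad
       \Gamma; H_0 \vdash \mathtt{sig}\ X\ \mathtt{end} \le \mathtt{sig}\ X'\ \mathtt{end}}
      {\Gamma \vdash \mathtt{sig}\ H_0, X\ \mathtt{end} \le \mathtt{sig}\ X'\ \mathtt{end}}
\\[2ex]
\dfrac{\Gamma \vdash S'_1 \le S_1 \qquad \Gamma; x_i : S'_1 \vdash S_2 \le S'_2}
      {\Gamma \vdash (\mathtt{fsig}(x_i : S_1) :> S_2) \le (\mathtt{fsig}(x_i : S'_1) :> S'_2)}
\qquad
\dfrac{\Gamma \vdash S \le S'}
      {\Gamma \vdash (x_i : S) \le (x_i : S')}
\\[2ex]
\dfrac{\Gamma \vdash \tau = \tau'}
      {\Gamma \vdash (t_i = \tau) \le (t_i = \tau')}
\qquad
\dfrac{}{\Gamma \vdash (t_i = \tau) \le t_i}
\qquad
\dfrac{\Gamma \vdash \tau = \tau'}
      {\Gamma \vdash (v_i : \tau) \le (v_i : \tau')}
\end{array}
$$

Figure 6: Signature subsumption in AMC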


supports abstract functors, a functor signature in AMC always uses
":>" to specify its result. Later, in Section 3.2, we extend AMC
with transparent functor signatures of the form $\mathtt{fsig}(x_i : S) : S'$.

AMC allows two kinds of type specifications: flexible type
specifications ($t_i$) and manifest type specifications ($t_i = \tau$). Figure 6
lists the standard signature subsumption rules. Manifest types can
be made opaque when matched against flexible specifications.
Subsumption on functor signatures is contravariant in the argument
but covariant in the result.
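
The subsumption rules of Figure 6 can be read as a small matching procedure. The following is a minimal sketch under stated assumptions, not the paper's algorithm: signatures are encoded as nested Python tuples (a hypothetical encoding), labels are assumed to be unique within a signature, bound functor parameters are assumed to have identical names, and the type-equivalence judgement is approximated by syntactic equality, so the context extension by $H_1$ is ignored.

```python
# Hypothetical encoding (not from the paper):
#   ("sig", [spec, ...])                regular signature
#   ("fsig", x, S_arg, S_res)           abstract functor signature fsig(x : S_arg) :> S_res
#   ("mod", x, S) | ("type", t) | ("type=", t, tau) | ("val", v, tau)

def label(spec):
    """Namespace and name of a spec; flexible and manifest types share a namespace."""
    tag, name = spec[0], spec[1]
    return ("type" if tag in ("type", "type=") else tag, name)

def sub_spec(h, h2):
    """Spec-level rules; type equivalence is approximated by == (an assumption)."""
    if h[0] == "mod" and h2[0] == "mod":
        return sub_sig(h[2], h2[2])
    if h[0] == "type=" and h2[0] == "type=":
        return h[2] == h2[2]
    if h[0] in ("type", "type=") and h2[0] == "type":
        return True          # a manifest type may be made opaque
    if h[0] == "val" and h2[0] == "val":
        return h[2] == h2[2]
    return False

def sub_specs(hs, hs2):
    """Ordered matching: either match the two heads or drop the left head."""
    if not hs2:
        return True          # every remaining left spec can be dropped
    if not hs:
        return False
    if label(hs[0]) == label(hs2[0]) and sub_spec(hs[0], hs2[0]):
        return sub_specs(hs[1:], hs2[1:])
    return sub_specs(hs[1:], hs2)

def sub_sig(s, s2):
    """Approximate check of S <= S' for the simplified encoding above."""
    if s[0] == "sig" and s2[0] == "sig":
        return sub_specs(s[1], s2[1])
    if s[0] == "fsig" and s2[0] == "fsig":
        _, x, arg, res = s
        _, x2, arg2, res2 = s2
        # contravariant in the argument, covariant in the result
        return x == x2 and sub_sig(arg2, arg) and sub_sig(res, res2)
    return False

manifest = ("sig", [("type=", "t", "int"), ("val", "v", "t")])
flexible = ("sig", [("type", "t"), ("val", "v", "t")])
print(sub_sig(manifest, flexible))   # True
print(sub_sig(flexible, manifest))   # False
```

A functor that accepts the flexible signature and returns the manifest one can be used where a functor from the manifest signature to the flexible one is expected, but not the other way round; this is the variance stated above.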

Figure 5 lists the formation rules for the AMC module expressions
and declarations. AMC supports the usual set of module constructs
such as module access paths ($p$), structure definitions
($\mathtt{str}\ d_1, \ldots, d_n\ \mathtt{end}$), functor definitions ($\mathtt{fct}(x_i : S)\,m$), functor
application ($p_1(p_2)$), signature matching ($p :> S$), and the let
expression. Most of the typing rules for AMC are straightforward:
signature matching in AMC is done opaquely; to type a let
expression, the result signature must not contain any references to
locally defined module variables (i.e., $S$ must be well formed in context
$\Gamma$, not merely in $\Gamma; H$; see Figure 5).

Type sharing in AMC is propagated through signature strengthening
and functor application. Signature strengthening, which is
defined in Figure 4, is a variation of dot notation [4]; a module


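All in AMC (Figure 2) plus:

$$
\begin{array}{llcl}
\textit{sig}  & S      & ::= & \ldots \mid \mathtt{fsig}(x_i : S) : S' \\
\textit{ctyp} & \tau   & ::= & \ldots \mid \#t(C) \\
\textit{ctxt} & \Gamma & ::= & \ldots \mid \Gamma; u : K \\[1ex]
\multicolumn{4}{l}{\text{Module constructor and kind}} \\
\textit{mcon} & C & ::= & \overline{x_i} \mid u \mid \lambda u{:}K.C \mid C_1[C_2] \mid \{F_1, \ldots, F_n\} \mid \#x(C) \\
\textit{mcfd} & F & ::= & t = \tau \mid x = C \\
\textit{mknd} & K & ::= & \{Q_1, \ldots, Q_n\} \mid K_1 \to K_2 \\
\textit{mkfd} & Q & ::= & t : \Omega \mid x : K
\end{array}
$$

Figure 7: Syntax of the extended module calculus EMC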


identifier $x_i$ of signature $S$ is strengthened to have signature $S/x_i$.
Functor application (e.g., $p_1(p_2)$) can propagate the sharing
information in the argument ($p_2$) into the result signature; this is
achieved by replacing the formal parameter $x_i$ with the actual
argument $p_2$ (see Figure 5).

Unfortunately, this strengthening procedure has no effect on the
functor components of a signature. In the higher-order case, functor
application in AMC does not propagate as much sharing as one would normally
expect in the MacQueen-Tofte system. In the following, we show
how to extend AMC to support fully transparent functors.

3.2  The  extended  module  calculus  EMC 

The extended module calculus EMC contains the same set of module
expressions and declarations as those in AMC. However, EMC
uses a different method to propagate sharing information; this
allows EMC to support fully transparent higher-order functors. EMC
also has a more expressive signature calculus so that all functors in
EMC have fully syntactic signatures.

The  syntax  of  EMC  is  given  in  Figure  7.  The  static  semantics 
for  EMC  is  summarized  in  Figure  8.  The  complete  typing  rules 
are  given  in  Figures  9  to  15  and  in  Appendix  B.  Our  typing  rules 
can  be  directly  turned  into  a  type-checking  algorithm  because  the 
signature  subsumption  rules  are  only  used  at  functor  application 
and opaque signature matching (the same is true for AMC).

Similar forms as those for AMC (Figure 3) plus:

$$
\begin{array}{ll}
\text{mcon formation}   & \Gamma \vdash C : K \\
\text{mcfd formation}   & \Gamma \vdash F : Q \\
\text{mcon equivalence} & \Gamma \vdash C = C' : K \\
\text{mcfd equivalence} & \Gamma \vdash F = F' : Q \\
\text{mknd subsumption} & \vdash K \le K' \\
\text{mkfd subsumption} & \vdash Q \le Q'
\end{array}
$$

Figure 8: Static semantics for EMC: a summary


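$$
\begin{array}{c}
\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad u : K \in \Gamma}
      {\Gamma \vdash u : K}
\qquad
\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad x_i : S \in \Gamma}
      {\Gamma \vdash \overline{x_i} : \mathtt{knd}(S)}
\\[2ex]
\dfrac{\Gamma \vdash F_j : Q_j \qquad j = 1, \ldots, n}
      {\Gamma \vdash \{F_1, \ldots, F_n\} : \{Q_1, \ldots, Q_n\}}
\qquad
\dfrac{\Gamma \vdash C : K' \qquad K' = \{\ldots, x{:}K, \ldots\}}
      {\Gamma \vdash \#x(C) : K}
\\[2ex]
\dfrac{\Gamma; u : K \vdash C : K'}
      {\Gamma \vdash \lambda u{:}K.C : K \to K'}
\qquad
\dfrac{\Gamma \vdash C_1 : K \to K' \qquad \Gamma \vdash C_2 : K}
      {\Gamma \vdash C_1[C_2] : K'}
\\[2ex]
\dfrac{\Gamma \vdash \tau}
      {\Gamma \vdash (t = \tau) : (t : \Omega)}
\qquad
\dfrac{\Gamma \vdash C : K}
      {\Gamma \vdash (x = C) : (x : K)}
\end{array}
$$

Figure 9: EMC constructor formation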


The EMC signature calculus contains two new features that are
not present in AMC: one is the new functor signature
$\mathtt{fsig}(x_i : S) : S'$ used to specify transparent higher-order functors; the other
is a simple constructor calculus that captures the sharing information
(using constructor $C$ and kind $K$), together with a new type expression
$\#t(C)$ that selects the type field $t$ from constructor $C$.

Transparent functors can propagate more sharing than abstract
functors. For example, suppose $S$ is defined as $\mathtt{sig}\ t_i,\ x_i : t_i\ \mathtt{end}$.
A functor with signature $\mathtt{fsig}(x_i : S) :> S$ corresponds to an
abstract functor whose application always produces a module with a
new abstract type component $t$. On the other hand, a functor with
signature $\mathtt{fsig}(x_i : S) : S$ corresponds to a fully transparent
functor whose application always propagates the type information from
its argument into its result.

The constructor calculus itself (see Figure 7) is similar to those
used in $F_\omega$-like polymorphic $\lambda$-calculi. In this paper, we assume
all types in the core language have kind $\Omega$; we use $u$ to denote
constructor variables; and we use the record kind $\{Q_1, \ldots, Q_n\}$ and
the function kind $K_1 \to K_2$ to type module constructors. A record
constructor consists of a sequence of core-language types (marked
by label $t$) and module constructors (marked by label $x$). Given
a record constructor $C$, the selection form $\#x(C)$ is a module
constructor equivalent to the $x$ field of $C$, while $\#t(C)$ is a
core-language type expression equivalent to the $t$ field of $C$. Figure 9
gives the formation rules for the constructor calculus; the other typing
rules summarized in Figure 8 are given in Appendix B.

The  constructor  calculus  is  designed  to  faithfully  capture  the 
sharing  information  inside  all  EMC  module  constructs.  More 


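$$
\begin{array}{c}
\mathtt{knd}(\mathtt{sig}\ H_1 \ldots H_n\ \mathtt{end}) \Rightarrow \{\, \mathtt{knd}(H_1) \ldots \mathtt{knd}(H_n) \,\}
\\[2ex]
\dfrac{\mathtt{knd}(S) \Rightarrow K}
      {\mathtt{knd}(\mathtt{fsig}(x_i : S) :> S') \Rightarrow K \to \{\}}
\qquad
\dfrac{\mathtt{knd}(S) \Rightarrow K \qquad \mathtt{knd}(S') \Rightarrow K'}
      {\mathtt{knd}(\mathtt{fsig}(x_i : S) : S') \Rightarrow K \to K'}
\\[2ex]
\mathtt{knd}(t_i) \Rightarrow t : \Omega
\qquad
\mathtt{knd}(t_i = \tau) \Rightarrow \varepsilon
\\[1ex]
\mathtt{knd}(x_i : S) \Rightarrow x : \mathtt{knd}(S)
\qquad
\mathtt{knd}(v_i : \tau) \Rightarrow \varepsilon
\end{array}
$$

Figure 10: EMC kind calculation $\mathtt{knd}(S)$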


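$S/C$ is a shorthand for $S/(C : \mathtt{knd}(S))$

$$
\begin{array}{c}
\dfrac{H_j/(C : K) \Rightarrow H'_j \qquad j = 1, \ldots, n}
      {(\mathtt{sig}\ H_1 \ldots H_n\ \mathtt{end})/(C : K) \Rightarrow \mathtt{sig}\ H'_1 \ldots H'_n\ \mathtt{end}}
\\[2ex]
(\mathtt{fsig}(x_i : S) :> S')/(C : K) \Rightarrow \mathtt{fsig}(x_i : S) :> S'
\\[2ex]
\dfrac{K = K' \to K'' \qquad S'/(C[\overline{x_i}] : K'') \Rightarrow S''}
      {(\mathtt{fsig}(x_i : S) : S')/(C : K) \Rightarrow \mathtt{fsig}(x_i : S) :> S''}
\\[2ex]
\dfrac{S/(\#x(C) : K) \Rightarrow S'}
      {(x_i : S)/(C : \{\ldots, x{:}K, \ldots\}) \Rightarrow (x_i : S')}
\\[2ex]
(t_i)/(C : \{\ldots, t{:}\Omega, \ldots\}) \Rightarrow (t_i = \#t(C))
\\[1ex]
(t_i = \tau)/(C : K) \Rightarrow (t_i = \tau)
\qquad
(v_i : \tau)/(C : K) \Rightarrow (v_i : \tau)
\end{array}
$$

Figure 11: Signature strengthening in EMC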


specifically, given a signature (or a functor signature) $S$, we extract
all the flexible components in $S$ into a single constructor variable
$u$; we call $u$ the flexroot constructor of signature $S$. We
use $K$ to denote the kind of $u$ and $S'$ to denote the instantiation of
$S$ whose flexible components are redirected to the corresponding
entries in $u$. An opaque view of signature $S$ can be modeled as
an existential type $\exists u : K.\, S'$. A transparent view of $S$ can be
obtained by substituting the actual constructor information for the
flexroot of $S$. Full transparency is then achieved by propagating the
flexroot information through functor application.
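
As a small worked illustration, consider a signature with one flexible type and one value of that type. The signature $S_0$, the constructor $C_0$, and the core type $\mathtt{int}$ below are hypothetical examples, not taken from the text; the kind and the instantiation follow the rules of Figures 10 and 11.

```latex
\[
\begin{array}{rcl}
S_0 & = & \mathtt{sig}\ t_i,\ v_i : t_i\ \mathtt{end} \\[0.5ex]
K_0 \;=\; \mathtt{knd}(S_0) & = & \{\, t : \Omega \,\} \\[0.5ex]
S_0' \;=\; S_0/u & = & \mathtt{sig}\ t_i = \#t(u),\ v_i : t_i\ \mathtt{end} \\[0.5ex]
\text{opaque view of } S_0 & : & \exists u : \{\, t : \Omega \,\}.\ \mathtt{sig}\ t_i = \#t(u),\ v_i : t_i\ \mathtt{end} \\[0.5ex]
\text{transparent view, } C_0 = \{\, t = \mathtt{int} \,\} & : &
  S_0/C_0 \;=\; \mathtt{sig}\ t_i = \#t(\{\, t = \mathtt{int} \,\}),\ v_i : t_i\ \mathtt{end}
\end{array}
\]
```

Because $\#t(C)$ is equivalent to the $t$ field of $C$, the last signature exposes $t_i$ as $\mathtt{int}$, whereas the opaque view keeps it abstract behind $u$.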

Both $K$ and $S'$ can be calculated easily. Figure 10 shows how to
deduce $\mathtt{knd}(S)$, the kind of the flexroot constructor of a module
with signature $S$. Here, $\mathtt{knd}(S) \Rightarrow K$ means that the flexible
constructor part of signature $S$ is of kind $K$, and $\mathtt{knd}(H) \Rightarrow Q^*$
means that the flexible part in specification $H$ is of kind field $Q^*$
(which denotes either $Q$ or the empty field $\varepsilon$). Notice that, in addition to
the flexible type specifications ($t_i$), functor specifications are also
considered flexible components. A transparent functor with
signature $\mathtt{fsig}(x_i : S) : S'$ is treated as a higher-order constructor
of kind $K \to K'$, where $K$ and $K'$ are the kinds for $S$ and $S'$. An
abstract functor with signature $\mathtt{fsig}(x_i : S) :> S'$ is treated as a
dummy constructor whose result has the empty record kind.
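
The kind calculation of Figure 10 is a direct structural recursion. The sketch below is illustrative only: the tuple encoding of signatures and kinds is an assumption made for this example, and `None` stands in for the empty field $\varepsilon$.

```python
# Hypothetical encoding (not from the paper):
#   signatures: ("sig", [spec, ...]) | ("fsig", x, S_arg, op, S_res) with op in {":", ":>"}
#   specs:      ("mod", x, S) | ("type", t) | ("type=", t, tau) | ("val", v, tau)
#   kinds:      ("rec", [(label, kind_or_OMEGA), ...]) | ("arrow", K1, K2)

OMEGA = "Omega"   # kind of core-language types

def knd_spec(spec):
    """Kind field contributed by one specification, or None for the empty field."""
    tag = spec[0]
    if tag == "type":                 # flexible type t_i  =>  t : Omega
        return (spec[1], OMEGA)
    if tag == "mod":                  # module x_i : S     =>  x : knd(S)
        return (spec[1], knd(spec[2]))
    if tag in ("type=", "val"):       # manifest types and values are not flexible
        return None
    raise ValueError(f"unknown specification tag {tag!r}")

def knd(sig):
    """Kind of the flexroot constructor of a signature."""
    if sig[0] == "sig":
        fields = [knd_spec(h) for h in sig[1]]
        return ("rec", [f for f in fields if f is not None])
    if sig[0] == "fsig":
        _, _x, arg, op, res = sig
        if op == ":>":                # abstract functor: result kind is {}
            return ("arrow", knd(arg), ("rec", []))
        return ("arrow", knd(arg), knd(res))   # transparent functor
    raise ValueError(f"unknown signature tag {sig[0]!r}")

S = ("sig", [("type", "t"), ("val", "v", "t")])
print(knd(S))                              # ('rec', [('t', 'Omega')])
print(knd(("fsig", "x", S, ":>", S)))      # ('arrow', ('rec', [('t', 'Omega')]), ('rec', []))
```

Under this encoding a transparent functor over `S` receives the kind `("arrow", knd(S), knd(S))`, so its application can carry the argument's type field into the result.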

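Signature $S'$ is calculated using a procedure similar to the idea
of signature strengthening, but signature strengthening in EMC is
very different from that in AMC: instead of relying on the access
path $p$ to propagate sharing, EMC uses the flexroot constructor to

Only two of these rules (module identifier and functor application) are different from those for AMC (Figure 5):

$$
\begin{array}{c}
\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad x_i : S \in \Gamma}
      {\Gamma \vdash x_i : S/\overline{x_i}}
\\[2ex]
\dfrac{\Gamma \vdash p : \mathtt{sig}\ H_1 \ldots H_k,\ x'_i : S', \ldots\ \mathtt{end} \qquad
       \rho = \{\, t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in \mathrm{Dom}(H_1 \ldots H_k) \,\}}
      {\Gamma \vdash p.x' : \rho(S')}
\\[2ex]
\dfrac{\Gamma; x_i : S \vdash m : S'}
      {\Gamma \vdash \mathtt{fct}(x_i : S)\,m : \mathtt{fsig}(x_i : S) :> S'}
\\[2ex]
\dfrac{\Gamma \vdash p_1 : \mathtt{fsig}(x_i : S) :> S' \qquad \Gamma \vdash p_2 : S'' \qquad
       \Gamma \vdash S'' \le S \qquad \Gamma \vdash S'' \downarrow \mathtt{knd}(S) \Rightarrow C}
      {\Gamma \vdash p_1(p_2) : \{ \overline{x_i} \mapsto C,\ x_i \mapsto p_2 \}(S')}
\\[2ex]
\dfrac{\vdash \Gamma\ \mathrm{ok}}
      {\Gamma \vdash \mathtt{str}\ \mathtt{end} : \mathtt{sig}\ \mathtt{end}}
\qquad
\dfrac{\Gamma; H_1; \ldots; H_{k-1} \vdash d_k : H_k \qquad k = 1, \ldots, n}
      {\Gamma \vdash \mathtt{str}\ d_1, \ldots, d_n\ \mathtt{end} : \mathtt{sig}\ H_1, \ldots, H_n\ \mathtt{end}}
\qquad
\dfrac{\Gamma \vdash p : S' \qquad \Gamma \vdash S' \le S}
      {\Gamma \vdash (p :> S) : S}
\\[2ex]
\dfrac{\Gamma \vdash S \qquad \Gamma \vdash d : H \qquad \Gamma; H \vdash m : S}
      {\Gamma \vdash \mathtt{let}\ d\ \mathtt{in}\ m : S}
\qquad
\dfrac{\Gamma \vdash m : S}
      {\Gamma \vdash (x_i = m) : (x_i : S)}
\qquad
\dfrac{\Gamma \vdash \tau}
      {\Gamma \vdash (t_i = \tau) : (t_i = \tau)}
\qquad
\dfrac{\Gamma \vdash e : \tau}
      {\Gamma \vdash (v_i = e) : (v_i : \tau)}
\end{array}
$$

Figure 12: Selected typing rules for EMC: $\Gamma \vdash m : S$ and $\Gamma \vdash d : H$


All subsumption rules in AMC (Figure 6) plus:

$$
\begin{array}{c}
\dfrac{\Gamma \vdash S'_1 \le S_1 \qquad \Gamma; x_i : S'_1 \vdash S_2 \le S'_2}
      {\Gamma \vdash (\mathtt{fsig}(x_i : S_1) : S_2) \le (\mathtt{fsig}(x_i : S'_1) : S'_2)}
\\[2ex]
\dfrac{}{\Gamma \vdash (\mathtt{fsig}(x_i : S_1) : S_2) \le (\mathtt{fsig}(x_i : S_1) :> S_2)}
\\[2ex]
\dfrac{S_2 \text{ is an instantiated signature}}
      {\Gamma \vdash (\mathtt{fsig}(x_i : S_1) :> S_2) \le (\mathtt{fsig}(x_i : S_1) : S_2)}
\end{array}
$$

Figure 13: Signature subsumption in EMC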


strengthen a signature. Given a signature $S$ and a constructor $C$
of kind $\mathtt{knd}(S)$, signature strengthening $S/C$ returns the result
of substituting $C$ for the flexroot constructor in $S$. We use the
auxiliary procedures given in Figure 11 to deduce $S/C$. Here,
$S/(C : K) \Rightarrow S'$ means that instantiating $S$ by constructor $C$
of kind $K$ yields signature $S'$, and $H/(C : K) \Rightarrow H'$ means that
strengthening specification $H$ by constructor $C$ of kind $K$ yields
specification $H'$. The additional kind parameter is used to identify
the flexible components in a signature.
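
The strengthening procedure of Figure 11 can be sketched as follows. This is an illustrative reading of the figure, not a definitive implementation: the tuple encodings of signatures, kinds, and constructors are assumptions made for this example, the kind is passed explicitly (so the shorthand $S/C$ corresponds to passing $\mathtt{knd}(S)$ by hand), and labels are assumed to be unique within a record kind.

```python
# Hypothetical encoding (not from the paper):
#   signatures:   ("sig", [spec, ...]) | ("fsig", x, S_arg, op, S_res), op in {":", ":>"}
#   specs:        ("mod", x, S) | ("type", t) | ("type=", t, tau) | ("val", v, tau)
#   kinds:        ("rec", [(label, kind_or_OMEGA), ...]) | ("arrow", K1, K2)
#   constructors: ("var", u) | ("flex", x) for the flexroot of x
#                 | ("sel", label, C) for #label(C) | ("app", C1, C2) for C1[C2]

OMEGA = "Omega"

def field_kind(kind, name):
    """Look up `name` in a record kind {..., name : K, ...}."""
    if kind[0] != "rec":
        raise ValueError("expected a record kind")
    for field_name, k in kind[1]:
        if field_name == name:
            return k
    raise ValueError(f"record kind has no field {name!r}")

def strengthen_spec(spec, con, kind):
    """H/(C : K): redirect the flexible part of one specification to C."""
    tag = spec[0]
    if tag == "mod":                       # strengthen the submodule by #x(C)
        _, x, s = spec
        return ("mod", x, strengthen(s, ("sel", x, con), field_kind(kind, x)))
    if tag == "type":                      # flexible t_i becomes t_i = #t(C)
        t = spec[1]
        if field_kind(kind, t) != OMEGA:
            raise ValueError(f"field {t!r} is not a type field")
        return ("type=", t, ("sel", t, con))
    return spec                            # manifest types and values are unchanged

def strengthen(sig, con, kind):
    """S/(C : K): the result is an instantiated signature."""
    if sig[0] == "sig":
        return ("sig", [strengthen_spec(h, con, kind) for h in sig[1]])
    _, x, arg, op, res = sig
    if op == ":>":                         # abstract functors are left alone
        return sig
    if kind[0] != "arrow":
        raise ValueError("a transparent functor needs a function kind")
    # transparent functor: strengthen the result by C[flexroot of x], make it abstract
    new_res = strengthen(res, ("app", con, ("flex", x)), kind[2])
    return ("fsig", x, arg, ":>", new_res)

S = ("sig", [("type", "t"), ("val", "v", "t")])
K = ("rec", [("t", OMEGA)])
print(strengthen(S, ("var", "u"), K))
# ('sig', [('type=', 't', ('sel', 't', ('var', 'u'))), ('val', 'v', 't')])
```

Applied to a transparent functor signature over `S` with kind `("arrow", K, K)`, the same function yields an abstract functor signature whose result type is `#t(u[x-bar])`, which is how the dependency on the argument's flexroot is recorded.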

Signature strengthening produces a special form of signature
whose type components are fully defined and whose functor
components have abstract result signatures. This special form, which
we call an instantiated signature, can be accurately defined using
the following grammar:

$$
\begin{array}{lcl}
S^I & ::= & \mathtt{sig}\ H^I_1, \ldots, H^I_n\ \mathtt{end} \\
    & \mid & \mathtt{fsig}(x_i : S) :> S \\
H^I & ::= & x_i : S^I \mid t_i = \tau \mid v_i : \tau
\end{array}
$$

Notice that under this special form, the argument of a functor signature
can still be an arbitrary EMC signature, but the result must
always be abstract. The following lemma can be proved by structural
induction on the EMC signatures:

Lemma 3.1 Given an EMC context $\Gamma$, a signature $S$, a kind $K$,
and a constructor $C$, if $\Gamma \vdash S$ and $\Gamma \vdash C : K$ and $\vdash K \le \mathtt{knd}(S)$,
then $S/C$ is an instantiated signature and $\Gamma \vdash S/C$.

Figure 12 gives the typing rules for the EMC module expressions
and declarations. Intuitively, we say a module expression $m$
has signature $S$ if $m$ has type equal to $\exists u : \mathtt{knd}(S).(S/u)$. Given


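$$
\begin{array}{c}
\dfrac{\Gamma \vdash H_j \downarrow K \Rightarrow F^*_j \qquad j = 1, \ldots, n}
      {\Gamma \vdash (\mathtt{sig}\ H_1 \ldots H_n\ \mathtt{end}) \downarrow K \Rightarrow \{F^*_1 \ldots F^*_n\}}
\\[2ex]
\dfrac{\vdash K \le \mathtt{knd}(S) \qquad \Gamma; x_i : S \vdash S' \downarrow K' \Rightarrow C' \qquad
       C = \lambda u{:}K.\{\overline{x_i} \mapsto u\}(C') \qquad \Gamma \vdash C : K \to K'}
      {\Gamma \vdash (\mathtt{fsig}(x_i : S) :> S') \downarrow (K \to K') \Rightarrow C}
\\[2ex]
\dfrac{\Gamma \vdash S \downarrow K \Rightarrow C}
      {\Gamma \vdash (x_i : S) \downarrow \{\ldots, x{:}K, \ldots\} \Rightarrow (x = C)}
\qquad
\dfrac{\Gamma \vdash \tau}
      {\Gamma \vdash (t_i = \tau) \downarrow \{\ldots, t{:}\Omega, \ldots\} \Rightarrow (t = \tau)}
\\[2ex]
\text{For all other cases, } \Gamma \vdash H \downarrow K \Rightarrow \varepsilon
\end{array}
$$

Figure 14: Narrowing instantiated signatures in EMC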


a module $x_i$ of signature $S$, we use the overlined identifier $\overline{x_i}$ to
refer to the flexroot constructor hidden inside $x_i$. This is a form of
dot notation [5] where $\overline{x_i}$ represents the abstract type defined by the
existential package $x_i$. In AMC, signature strengthening is applied
to the access identifier ($x_i$) itself and hidden type components are
represented using access paths ($p$). EMC generalizes this idea so that it
can propagate more sharing than AMC does.

Figure 13 gives the additional signature subsumption rules for
the EMC signatures. Subsumption on transparent functor signatures
is also contravariant in the argument and covariant in the
result. More interestingly, a transparent signature $\mathtt{fsig}(x_i : S_1) : S_2$
is a subtype of its abstract counterpart $\mathtt{fsig}(x_i : S_1) :> S_2$: this is
because we can always coerce a transparent functor into an abstract
one by blocking all of its sharing information. Finally, an abstract
signature $\mathtt{fsig}(x_i : S_1) :> S_2$ is a subtype of its transparent
counterpart if the result $S_2$ is an instantiated signature; this corresponds
to the special case where the abstract version only hides a dummy
constructor, so it should be equivalent to the transparent version.

More specifically, a kind $K$ is a dummy kind if it is $\{\}$, or
$K_1 \to K_2$ where $K_2$ is a dummy kind, or $\{Q_1, \ldots, Q_n\}$ where all
fields $Q_i$ have dummy kinds. Given a context $\Gamma$ and a constructor
$C$, we say $C$ is a dummy constructor if $\Gamma \vdash C : K$ and $K$ is a
dummy kind. A dummy constructor conveys no useful information,
so it can be safely eliminated. It is easy to show that if $S$ is an
instantiated signature then $\mathtt{knd}(S)$ is a dummy kind.
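
The definition of a dummy kind translates into a short recursive predicate. The following sketch assumes a hypothetical tuple encoding of kinds chosen for this example, in which a type field carries the marker `OMEGA` and therefore never counts as dummy.

```python
# Hypothetical encoding (not from the paper):
#   ("rec", [(label, kind_or_OMEGA), ...])  record kind {Q_1, ..., Q_n}
#   ("arrow", K1, K2)                        function kind K1 -> K2

OMEGA = "Omega"   # kind of core-language types

def is_dummy(kind):
    """True if a constructor of this kind can carry no type information."""
    if kind == OMEGA:
        return False                       # a type field t : Omega is informative
    if kind[0] == "rec":
        # {} is dummy; otherwise every field must itself have a dummy kind
        return all(is_dummy(k) for _label, k in kind[1])
    if kind[0] == "arrow":
        return is_dummy(kind[2])           # only the result kind matters
    raise ValueError(f"unknown kind {kind!r}")

EMPTY = ("rec", [])
TYPE_REC = ("rec", [("t", OMEGA)])
print(is_dummy(EMPTY))                         # True
print(is_dummy(("arrow", TYPE_REC, EMPTY)))    # True
print(is_dummy(TYPE_REC))                      # False
```

The second example is the kind assigned to an abstract functor whose argument has one flexible type: its result kind is the empty record, so the whole kind is dummy regardless of the argument.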




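$$
\begin{array}{c}
\dfrac{\Gamma \vdash S \downarrow S' \Rightarrow C : K}
      {\Gamma \vdash (x_i : S) \downarrow (x_i : S') \Rightarrow (x = C) : (x : K)}
\qquad
\dfrac{\Gamma \vdash \tau = \tau'}
      {\Gamma \vdash (t_i = \tau) \downarrow (t_i = \tau') \Rightarrow \varepsilon : \varepsilon}
\qquad
\dfrac{\Gamma \vdash \tau = \tau'}
      {\Gamma \vdash (v_i : \tau) \downarrow (v_i : \tau') \Rightarrow \varepsilon : \varepsilon}
\\[2ex]
\Gamma \vdash (t_i = \tau) \downarrow (t_i) \Rightarrow (t = \tau) : (t : \Omega)
\qquad
\Gamma \vdash (\mathtt{sig}\ \mathtt{end}) \downarrow (\mathtt{sig}\ \mathtt{end}) \Rightarrow \{\} : \{\}
\\[2ex]
\dfrac{\Gamma \vdash H_1 \downarrow H'_1 \Rightarrow F^*_1 : Q^*_1 \qquad
       \Gamma; H_1 \vdash (\mathtt{sig}\ H_2 \ldots H_n\ \mathtt{end}) \downarrow (\mathtt{sig}\ H'_2 \ldots H'_m\ \mathtt{end}) \Rightarrow \{Fs\} : \{Qs\} \qquad
       \Gamma \vdash \{Fs\} : \{Qs\}}
      {\Gamma \vdash (\mathtt{sig}\ H_1 \ldots H_n\ \mathtt{end}) \downarrow (\mathtt{sig}\ H'_1 \ldots H'_m\ \mathtt{end}) \Rightarrow \{F^*_1, Fs\} : \{Q^*_1, Qs\}}
\\[2ex]
\dfrac{\Gamma; H_0 \vdash (\mathtt{sig}\ H_1 \ldots H_n\ \mathtt{end}) \downarrow (\mathtt{sig}\ H'_1 \ldots H'_m\ \mathtt{end}) \Rightarrow C : K \qquad
       \Gamma \vdash C : K}
      {\Gamma \vdash (\mathtt{sig}\ H_0 \ldots H_n\ \mathtt{end}) \downarrow (\mathtt{sig}\ H'_1 \ldots H'_m\ \mathtt{end}) \Rightarrow C : K}
\\[2ex]
\dfrac{\Gamma \vdash S'_1 \le S_1 \qquad \Gamma; x_i : S'_1 \vdash S_2 \le S'_2 \qquad
       C = \lambda u{:}\mathtt{knd}(S'_1).\{\} \qquad K = \mathtt{knd}(S'_1) \to \{\}}
      {\Gamma \vdash (\mathtt{fsig}(x_i : S_1)\ \mathit{op}\ S_2) \downarrow (\mathtt{fsig}(x_i : S'_1) :> S'_2) \Rightarrow C : K}
\quad \text{where } \mathit{op} \text{ is either } : \text{ or } :>
\\[2ex]
\dfrac{\Gamma \vdash S'_1 \le S_1 \qquad \Gamma; x_i : S'_1 \vdash S_2 \downarrow S'_2 \Rightarrow C_2 : K_2 \qquad
       C = \lambda u{:}\mathtt{knd}(S'_1).\{\overline{x_i} \mapsto u\}(C_2) \qquad
       K = \mathtt{knd}(S'_1) \to K_2 \qquad \Gamma \vdash C : K}
      {\Gamma \vdash (\mathtt{fsig}(x_i : S_1)\ \mathit{op}\ S_2) \downarrow (\mathtt{fsig}(x_i : S'_1) : S'_2) \Rightarrow C : K}
\quad \text{where } \mathit{op} \text{ is either } : \text{ or } :>
\end{array}
$$

Figure 15: Narrowing instantiated signatures in EMC (alternative version)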


Only two of the typing rules in Figure 12 are different from
those for AMC (in Figure 5): one for module identifiers and another
for functor application. To access a module identifier $x_i$, we
always strengthen it with its flexroot constructor $\overline{x_i}$. To type a functor
application $p_1(p_2)$, we first notice that the typing rules for access
paths (in Figure 12) satisfy the following property: if $\Gamma \vdash p : S$,
then $S$ is an instantiated signature. This observation can be easily
established via Lemma 3.1. So we can assume that $p_1$ has signature
$\mathtt{fsig}(x_i : S) :> S'$ and $p_2$ has signature $S''$; furthermore, $S''$ is an
instantiated signature. Typing $p_1(p_2)$ then involves checking that $S''$
matches $S$ (that is, $\Gamma \vdash S'' \le S$), extracting the actual flexroot
information in $p_2$ (call it $C$), and substituting constructor $C$ for all
instances of $\overline{x_i}$ in $S'$ and access path $p_2$ for all instances of $x_i$
(not counting $\overline{x_i}$). Here, the substitution on $\overline{x_i}$ is the key to why
EMC can propagate more sharing and support fully transparent
higher-order functors.

Constructor $C$ can be extracted from the actual argument
signature $S''$ of $p_2$ using the signature-narrowing procedure defined
in Figure 14. This procedure is initially invoked upon instantiated
signatures only. Given a context $\Gamma$, the deduction $\Gamma \vdash S \downarrow K \Rightarrow C$
extracts the type components from an instantiated signature $S$
and produces a constructor $C$ of kind $K$; the specification
counterpart $\Gamma \vdash H \downarrow K \Rightarrow F^*$ extracts the type components in
$H$ and produces either $F$ or the empty field $\varepsilon$. The side condition
$\Gamma \vdash C : K$ ensures that $C$ only contains identifiers defined in
$\Gamma$. We can prove the following lemma using structural induction on
the EMC signatures:

Lemma 3.2 Given an EMC context $\Gamma$, a signature $S$, and an
instantiated signature $S''$, let $K = \mathtt{knd}(S)$. If $\Gamma \vdash S'' \le S$
and $\Gamma \vdash S'' \downarrow K \Rightarrow C$, then $\Gamma \vdash C : K$.
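
The narrowing procedure of Figure 14 can be sketched in the same style. This is a simplified illustration rather than the procedure itself: the tuple encodings are assumptions made for this example, the side conditions (the kind subsumption premise and the well-formedness check on the resulting constructor) are omitted, and bound variables are named `u0`, `u1`, ... by nesting depth instead of being chosen fresh in general.

```python
# Hypothetical encoding (not from the paper):
#   signatures:   ("sig", [spec, ...]) | ("fsig", x, S_arg, op, S_res)
#   specs:        ("mod", x, S) | ("type=", t, tau) | ("val", v, tau)
#   kinds:        ("rec", [(label, kind_or_OMEGA), ...]) | ("arrow", K1, K2)
#   constructors: ("rec", [(label, value), ...]) | ("lam", u, K, C)
#                 | ("var", u) | ("flex", x) | ("sel", label, C)

OMEGA = "Omega"

def subst_flex(term, x, repl):
    """Replace every occurrence of the flexroot of x inside a nested term."""
    if term == ("flex", x):
        return repl
    if isinstance(term, (tuple, list)):
        return type(term)(subst_flex(part, x, repl) for part in term)
    return term

def narrow_spec(spec, kind, depth):
    """H narrowed to K: a constructor field, or None for the empty field."""
    wanted = dict(kind[1])
    tag, name = spec[0], spec[1]
    if tag == "mod" and name in wanted and wanted[name] != OMEGA:
        return (name, narrow(spec[2], wanted[name], depth))
    if tag == "type=" and wanted.get(name) == OMEGA:
        return (name, spec[2])             # keep the manifest type as t = tau
    return None                            # all other cases

def narrow(sig, kind, depth=0):
    """Extract a constructor of the given kind from an instantiated signature."""
    if sig[0] == "sig":
        if kind[0] != "rec":
            raise ValueError("a regular signature needs a record kind")
        fields = [narrow_spec(h, kind, depth) for h in sig[1]]
        return ("rec", [f for f in fields if f is not None])
    _, x, _arg, op, res = sig
    if op != ":>" or kind[0] != "arrow":
        raise ValueError("expected an abstract functor signature and a function kind")
    _, k_arg, k_res = kind
    u = f"u{depth}"
    body = narrow(res, k_res, depth + 1)
    # C = lambda u : K . {flexroot of x |-> u}(C')
    return ("lam", u, k_arg, subst_flex(body, x, ("var", u)))

S2 = ("sig", [("type=", "t", "int"), ("val", "v", "t")])
print(narrow(S2, ("rec", [("t", OMEGA)])))   # ('rec', [('t', 'int')])
print(narrow(S2, ("rec", [])))               # ('rec', [])
```

Narrowing the same signature against the empty record kind discards the type field, which mirrors how an abstract view blocks sharing while a transparent view keeps it.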

Figure 15 gives an alternative signature narrowing procedure.
This procedure is defined over arbitrary signatures, but it is initially
invoked upon an instantiated signature only. Given a context $\Gamma$, the
deduction $\Gamma \vdash S \downarrow S' \Rightarrow C : K$ extracts the type components from
an instantiated signature $S$ and produces a flexroot constructor $C$
of kind $K$ (for signature $S'$); the specification counterpart
$\Gamma \vdash H \downarrow H' \Rightarrow F^* : Q^*$ extracts the type components in $H$ and produces
either $F$ of kind $Q$ or the empty field $\varepsilon$. We use $Fs$ and $Qs$ to denote a
sequence of constructor fields and kind fields. The side conditions
$\Gamma \vdash C : K$ and $\Gamma \vdash \{Fs\} : \{Qs\}$ ensure that $C$ and $Fs$
only contain identifiers defined in $\Gamma$. It is easy to show that the two
signature narrowing procedures produce equivalent results.


This alternative signature narrowing procedure can also be used
to verify the signature subsumption relation defined in Figure 13.
To check if a functor signature $S$ is a subtype of another signature
$S'$, we first compare their corresponding argument signatures, then
compare their result signatures, and finally, in the case that $S$ is
abstract and $S'$ is transparent, we invoke the signature narrowing
procedure in Figure 15. If this algorithm does not get stuck, then $S$
is a sub-signature of $S'$.

Lemma 3.3 Given an EMC context $\Gamma$, a signature $S$, and an
instantiated signature $S''$, if $\Gamma \vdash S'' \downarrow S \Rightarrow C : K$ then
$\vdash K \equiv \mathtt{knd}(S)$ and $\Gamma \vdash C : K$.

Lemma 3.4 Given an EMC context $\Gamma$, a signature $S$, and a
constructor $C$, if $\Gamma \vdash C : \mathtt{knd}(S)$ then $\Gamma \vdash S/C \le S$.

Lemma 3.5 Given an EMC context $\Gamma$, two signatures $S$ and $S'$,
and an instantiated signature $S''$, if $\Gamma \vdash S' \le S''$ and
$\Gamma \vdash S'' \downarrow S \Rightarrow C : K$ then $\Gamma \vdash S' \downarrow S \Rightarrow C : K$.

Lemma 3.6 Given an EMC context $\Gamma$, a signature $S$, and an
instantiated signature $S''$, then $\Gamma \vdash S'' \le S$ is true if and only if
$\Gamma \vdash S'' \downarrow S \Rightarrow C : K$ succeeds.

Proof: For the "if" part: assuming $\Gamma \vdash S'' \downarrow S \Rightarrow C : K$,
we use structural induction on $S''$; along the way, we create a
bridging signature $S' = S/C$ and also show $\Gamma \vdash S'' \le S'$;
this bridge and the fact $\Gamma \vdash S' \le S$ (via Lemma 3.4) are used to
connect a transparent functor signature to its abstract counterpart.
For the "only if" part: we use structural induction on the size of the
instantiated signature $S''$ and show that if $\Gamma \vdash S'' \le S$ is true then
$\Gamma \vdash S'' \downarrow S \Rightarrow C : K$ will not get stuck. The only nontrivial case
is when $S''$ is an abstract functor signature $\mathtt{fsig}(x_i : S''_1) :> S''_2$
while $S$ is a transparent signature $\mathtt{fsig}(x_i : S_1) : S_2$; because
$\Gamma \vdash S'' \le S$, there must exist an instantiated bridging signature
$S'_2$ such that $\Gamma; x_i : S_1 \vdash S'_2 \le S_2$ and $\Gamma; x_i : S_1 \vdash S''_2 \le S'_2$;
because $S'_2$ is instantiated, we have $\Gamma \vdash S'_2 \downarrow S_2 \Rightarrow C_2 : K_2$ from
the induction hypothesis and $\Gamma \vdash S''_2 \downarrow S_2 \Rightarrow C_2 : K_2$ from
Lemma 3.5. □

Given an EMC context $\Gamma$, we say two signatures $S$ and $S'$ are
equivalent, denoted as $\Gamma \vdash S = S'$, if and only if both $\Gamma \vdash S \le S'$
and $\Gamma \vdash S' \le S$ are true. The following propositions show why
the typing rules for EMC hold together:

Lemma 3.7 Given an EMC context $\Gamma$, a signature $S$, and an
instantiated signature $S''$, assume $\Gamma \vdash S'' \le S$ and $\Gamma \vdash p_2 : S''$
and $\Gamma \vdash S'' \downarrow \mathtt{knd}(S) \Rightarrow C$, and let $\rho = \{\overline{x_i} \mapsto C,\ x_i \mapsto p_2\}$.
Then (1) given two type expressions $\tau_1$ and $\tau_2$, if $\Gamma; x_i : S \vdash \tau_1$ and
$\Gamma; x_i : S \vdash \tau_2$ and $\Gamma; x_i : S \vdash \tau_1 = \tau_2$ then $\Gamma \vdash \rho(\tau_1) = \rho(\tau_2)$;
(2) given two instantiated signatures $S'_1$ and $S'_2$, if $\Gamma; x_i : S \vdash S'_1$
and $\Gamma; x_i : S \vdash S'_2$ and $\Gamma; x_i : S \vdash S'_1 = S'_2$ then
$\Gamma \vdash \rho(S'_1) = \rho(S'_2)$.

Theorem 3.8 (unique typing) Given an EMC context $\Gamma$, two
signatures $S$ and $S'$, and a module expression $m$, if $\Gamma \vdash m : S$ and
$\Gamma \vdash m : S'$ then $\Gamma \vdash S = S'$.

Proof:  Expand  this  theorem  to  cover  module  declarations  and  core 
language  expressions;  the  generalized  version  of  this  theorem  can 
be  proved  by  structural  induction  on  the  derivation  tree.  □ 

3.3  Discussions  and  extensions 

The EMC calculus given so far only allows abstract or fully
transparent result signatures. We could extend the EMC signature
calculus further to support partially transparent functors:

$$
\textit{sig}\quad S \;::=\; \ldots \mid \mathtt{fsig}(x_i : S) : \mathtt{pv}(S', K)
$$

Here, $\mathtt{pv}$ is a modifier that indicates the result signature is partially
transparent, and the kind $K$ is used to fine-tune the amount of sharing
being propagated through functor application; a well-formed
signature must have $\vdash \mathtt{knd}(S') \le K$ so that the kind annotation
actually makes sense.

More aggressively, we could instead extend EMC with an
abstract module specification of the form:

$$
\textit{spec}\quad H \;::=\; \ldots \mid x_i :> S
$$

This seems to be less ad hoc than the $\mathtt{pv}$ keyword, but it makes it
harder to reuse large signatures with small changes to the transparency
notation.

EMC can also be extended to support other forms of module
expressions in SML'97. For example, in SML'97, the let expression
(at the module level) allows its body type to refer to the new type
stamps generated in the let declarations. Also, SML'97 supports
transparent signature matching such as:

    structure A : sig type t val f : t end =
      struct abstype s = ...
             type t = s -> s
             fun f (x : s) = x
      end

Here, type t is equivalent to s -> s, but the new type s is not
exported. Both of these features involve exporting values and types
that make use of hidden abstract types. While it is doubtful that
such an extension is really useful in practice, we can support it easily
by extending the EMC signature calculus with the following new
form of type specifications:

    spec  H ::= ... | hidden

We can then write down the interface $S_A$ for A as:

    sig type s is hidden
        type t = s -> s
        val f : t
    end

which in turn is equivalent to:

$$
\exists u : K_A.(\mathtt{sig}\ \mathtt{type}\ t = \#s(u) \to \#s(u)\ \ \mathtt{val}\ f : t\ \mathtt{end})
$$

where kind $K_A$ is just $\{s : \Omega\}$. In other words, if we write each
signature $S$ as a template of the form $\lambda u : K.S'$, the hidden type
specifications will be present in the kind $K$ but not in the body signature $S'$.
Notice that before this extension, $K$ is always equivalent to $\mathtt{knd}(S)$,
so all components in $K$ are always present in $S'$.

4  Expressiveness 

In this section, we show that both the translucent-sum-based
calculus and the strong-sum-based calculus can be embedded into our
EMC calculus. We also compare EMC with the stamp-based
semantics of the MacQueen-Tofte system [26, 36].

4.1  The  abstract  module  calculus  AMC 

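We use the AMC calculus presented in Section 3.1 as a
representative for the systems based on translucent sums [12] and manifest
types [19]. Because AMC is a subset of EMC, the translation from
AMC to EMC (denoted as $\lfloor \cdot \rfloor_a$) is just the identity function. We can
show that this translation $\lfloor \cdot \rfloor_a$ maps all well-typed AMC programs
into well-typed EMC programs.

Theorem 4.1 Given an AMC context $\Gamma$, we have

• if $\vdash \Gamma\ \mathrm{ok}$ is a valid AMC deduction then $\vdash \lfloor \Gamma \rfloor_a\ \mathrm{ok}$ is
valid in EMC; similarly,

• if $\Gamma \vdash \tau$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor \tau \rfloor_a$;

• if $\Gamma \vdash e : \tau$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor e \rfloor_a : \lfloor \tau \rfloor_a$;

• if $\Gamma \vdash S$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor S \rfloor_a$;

• if $\Gamma \vdash m : S$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor m \rfloor_a : \lfloor S \rfloor_a$;

• if $\Gamma \vdash d : H$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor d \rfloor_a : \lfloor H \rfloor_a$;

• if $\Gamma \vdash \tau = \tau'$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor \tau \rfloor_a = \lfloor \tau' \rfloor_a$;

• if $\Gamma \vdash S \le S'$ then $\lfloor \Gamma \rfloor_a \vdash \lfloor S \rfloor_a \le \lfloor S' \rfloor_a$.

Proof: By structural induction on the derivation tree. The main
difference between EMC and AMC is the way module identifiers
and functor applications are typed. For the case of module
identifiers, we use the following lemma (Lemma 4.2); for the case of
functor application, notice that the result of any AMC functor signature
does not contain any reference to the flexroot constructor $\overline{x_i}$, so the
typing rules for AMC and EMC have the same behavior. □

Lemma 4.2 Given an AMC context $\Gamma$, suppose $S$ is an AMC
signature and $x_i : S \in \Gamma$; then $\lfloor \Gamma \rfloor_a \vdash \lfloor S/x_i \rfloor_a = \lfloor S \rfloor_a / \overline{x_i}$ is a
valid deduction in EMC.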

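Proof: Notice that $S/x_i$ refers to the strengthening operation for AMC
(as in Figure 4) while $\lfloor S \rfloor_a / \overline{x_i}$ refers to the strengthening operation
for EMC (as in Figure 11). To prove this lemma, we need to show
the following: given an EMC type path $p.t$, let $x_i$ be the root
identifier in $p$, and let $F(p)$ denote the EMC constructor $\overline{x_i}$ if $p = x_i$, and
$\#x'(F(p'))$ if $p = p'.x'$; then the judgement $\Gamma \vdash p.t = \#t(F(p))$
is valid in EMC. □

$$
\begin{array}{llcl}
\textit{path} & p & ::= & x \mid \pi_1(p) \mid \pi_2(p) \\
\textit{mexp} & m & ::= & p \mid \iota_v(e) \mid \iota_t(\mu) \mid \langle x = m_1, m_2 \rangle \\
              &   & \mid & \lambda x{:}S.m \mid p_1(p_2) \mid \mathtt{let}\ x = m_1\ \mathtt{in}\ m_2 \\
\textit{sig}  & S & ::= & V(\mu) \mid \mathrm{TYP} \mid \Sigma x{:}S_1.S_2 \mid \Pi x{:}S_1.S_2 \\
\textit{ctsp} & \mu & ::= & \pi_t(p) \mid \ldots \\
\textit{cexp} & e & ::= & \pi_v(p) \mid \ldots \\[1ex]
\textit{mtyp} & M & ::= & V(\tau) \mid \mathrm{EQ}(\tau) \mid \Sigma x{:}M_1.M_2 \mid \Pi x{:}L.M \\
              & L & ::= & V(\tau) \mid \mathrm{TYP} \mid \Sigma x{:}L_1.L_2 \mid \Pi x{:}L_1.L_2 \\
\textit{ctyp} & \tau & ::= & \pi_t(m') \mid \ldots \\
\textit{ctme} & m' & ::= & x \mid \iota_v(e') \mid \iota_t(\tau) \mid \lambda x{:}L.m' \\
              &    & \mid & m'_1(m'_2) \mid \mathtt{let}\ x = m'_1\ \mathtt{in}\ m'_2 \\
              &    & \mid & \langle x = m'_1, m'_2 \rangle \mid \pi_1(m') \mid \pi_2(m') \\
\textit{ctce} & e' & ::= & \pi_v(m') \mid \ldots \\
\textit{ctxt} & \Gamma & ::= & \varepsilon \mid \Gamma; x{:}M \mid \Gamma; x{:}L
\end{array}
$$

Figure 16: Syntax of the transparent module calculus TMC


$$
\begin{array}{ll}
\text{ctxt formation} & \vdash \Gamma\ \mathrm{ok} \\
\text{mtyp formation} & \Gamma \vdash M \ \text{and}\ \Gamma \vdash L \\
\text{ctyp formation} & \Gamma \vdash \tau \\
\text{ctme formation} & \Gamma \vdash m' : M \\
\text{ctce formation} & \Gamma \vdash e' : \tau \\[1ex]
\text{cexp formation} & \Gamma \vdash e : \tau \\
\text{mexp formation} & \Gamma \vdash m : M \\
\text{sig formation}  & \Gamma \vdash S \\
\text{ctsp formation} & \Gamma \vdash \mu \\[1ex]
\text{ctyp equivalence}   & \Gamma \vdash \tau_1 = \tau_2 \\
\text{mtyp equivalence}   & \Gamma \vdash M_1 \equiv M_2 \ \text{or}\ \Gamma \vdash L_1 \equiv L_2 \\
\text{mtyp subsumption}   & \Gamma \vdash M \le L \\
\text{mtyp strengthening} & L/m' \Rightarrow M
\end{array}
$$

Figure 17: Static semantics for TMC: a summary


$$
\begin{array}{ll}
\text{ctxt-to-ctxt translation} & \lfloor \Gamma \rfloor_n \mapsto \Gamma \\
\text{ctsp-to-ctyp translation} & \lfloor \mu \rfloor_n \mapsto \tau \\
\text{sig-to-sig translation}   & \lfloor S \rfloor_n \mapsto S \\
\text{cexp-to-cexp translation} & \lfloor e \rfloor_n \mapsto e \\
\text{path-to-path translation} & \lfloor p \rfloor_n \mapsto p \\
\text{mexp-to-mexp translation} & \lfloor m \rfloor_n \mapsto m \\[1ex]
\text{ctyp-to-ctyp translation} & \Gamma \vdash \tau \leadsto \tau' \\
\text{mtyp-to-sig translation}  & \Gamma \vdash M \leadsto S \\
\text{mtyp-to-sig translation}  & \Gamma \vdash L \leadsto S \\
\text{mtyp-to-kind translation} & \lfloor M \rfloor_c \mapsto K \\
\text{mtyp-to-kind translation} & \lfloor L \rfloor_c \mapsto K \\
\text{mtyp-to-mcon translation} & \Gamma \vdash M \leadsto C \\
\text{ctme-to-mcon translation} & \Gamma \vdash m' : M \leadsto C
\end{array}
$$

Figure 18: Translation from TMC to EMC: a summary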


4.2  The  transparent  module  calculus  TMC 

We  use  the  Transparent  Module  Calculus  (TMC)  as  a  representative 
of  the  strong-sum-based  approach.  The  syntax  of  TMC  is  given  in 
Figure  16;  the  static  semantics  is  summarized  in  Figure  17;  the 
complete  typing  rules  are  given  in  Appendix  C. 

Following  other  strong-sum-based  module  systems  [26, 28,  36], 
we  distinguish  module  signatures  (S)  from  module  types  (M  and 
L):  module  signatures  are  source-level  specifications  while  module 
types  are  semantic  objects  used  for  typechecking. 

A  module  signature  can  either  contain  a  single  value  specifi¬ 
cation  (v(/x)),  a  single  type  specification  (TYP),  or  a  pair  of  two 
other  module  components  (Ex  :  S1.S2);  it  can  also  be  a  functor 
signature  (IIx  :  S1.S2).  Only  simple  access  paths  (7 rt(p))  are  al¬ 
lowed  in  a  specification.3  An  L-shaped  module  type  is  like  a  mod¬ 
ule  signature  except  that  in  its  value  specification  v(t),  core  type  r 
can  contain  arbitrary  module  expressions  (m' ).  M-shaped  module 
types  are  slightly  different  from  L-shaped  ones:  they  allow  man¬ 
ifest  types  (or  type  abbreviations)  of  form  EQ(t)  but  no  flexible 
type  specification  of  form  TYP.  The  module  expression  m'  inside 
the  core  type  r  helps  achieve  the  fully  transparent  propagation  of 
the  sharing  information  in  TMC. 

3The  Standard  ML  signature  calculus  [28,  27]  enforces  a  similar  restriction. 


A module expression in TMC can be an access path ($p$), a
single-value-component module ($\iota_v(e)$), a single-type-component
module ($\iota_t(\mu)$), a strong sum of two module components
($\langle x = m_1, m_2 \rangle$), a functor ($\lambda x : S.m$), a
functor application ($p_1(p_2)$), or a let expression.

To simplify the presentation, we restrict TMC functor application
to simple access paths only (i.e., $p_1(p_2)$). Arbitrary functor
applications (e.g., $m_1(m_2)$) can be A-normalized into the
restricted form using let expressions. We also do not support
type abbreviations in signatures. We insist that $M$ can be a
subtype of $L$ only if they have the same number of components (see
the subtyping rules $\Gamma \vdash M \le L$ in Appendix C). These
restrictions do not affect the main result because it is easy (but
tedious) to extend TMC and the TMC-to-EMC translation to support
the additional features.
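The A-normalization step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the `Path`, `App`, and `Let` classes and the `a_normalize` function are hypothetical names that do not appear in the paper, only paths and applications are handled, and the generated names `x1`, `x2`, ... are assumed not to clash with names already in scope.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Path:          # an access path such as F or X (hypothetical encoding)
    name: str


@dataclass(frozen=True)
class App:           # functor application with arbitrary operands
    fun: object
    arg: object


@dataclass(frozen=True)
class Let:           # let var = bound in body
    var: str
    bound: object
    body: object


def a_normalize(m):
    """Rewrite m so that every application has the restricted form p1(p2)."""
    fresh = count(1)
    bindings = []    # (name, restricted application), in evaluation order

    def to_path(e):
        # Return a path standing for e, let-binding nested applications.
        if isinstance(e, Path):
            return e
        if isinstance(e, App):
            named = App(to_path(e.fun), to_path(e.arg))
            x = f"x{next(fresh)}"
            bindings.append((x, named))
            return Path(x)
        raise TypeError(f"unsupported module expression: {e!r}")

    if isinstance(m, Path):
        result = m
    elif isinstance(m, App):
        result = App(to_path(m.fun), to_path(m.arg))
    else:
        raise TypeError(f"unsupported module expression: {m!r}")

    # Wrap the result so that the earliest binding ends up outermost.
    for x, bound in reversed(bindings):
        result = Let(x, bound, result)
    return result


# F(G(X)) becomes: let x1 = G(X) in F(x1)
nested = App(Path("F"), App(Path("G"), Path("X")))
normalized = a_normalize(nested)
```

After the rewrite, both the operator and the operand of every application are paths, which is the shape the restricted TMC application rule expects.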

Figure 18 summarizes the translation from TMC to EMC; the
actual definition is given in Appendix D. Here, $\lfloor\cdot\rfloor_n$
maps TMC contexts, core types (in signatures), signatures, core
expressions, access paths, and module expressions into their EMC
counterparts; $\lfloor\cdot\rfloor_c$ maps TMC module types into EMC
kinds. The translation from TMC types to EMC types is based on the
type formation rules, so the judgement $\Gamma \vdash \tau \leadsto \tau'$
maps the TMC core type $\tau$ into an EMC core type $\tau'$; the
judgements $\Gamma \vdash M \leadsto S$ and $\Gamma \vdash L \leadsto S$
map the TMC module types $M$ or $L$ into an EMC signature $S$.
We also use the judgements $\Gamma \vdash M \leadsto C$ and
$\Gamma \vdash m' : M \leadsto C$ to map TMC module types and
expressions (embedded inside core types) into EMC constructors.
We can prove the following type preservation theorem for the
TMC-to-EMC translation:

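Theorem 4.3 Given a TMC context $\Gamma$, we have:

•  if $\vdash \Gamma$ ok is a valid deduction in TMC, then
$\vdash \lfloor\Gamma\rfloor_n$ ok is valid in EMC; similarly,

•  if $\Gamma \vdash \mu$ then $\lfloor\Gamma\rfloor_n \vdash \lfloor\mu\rfloor_n$;

•  if $\Gamma \vdash S$ then $\lfloor\Gamma\rfloor_n \vdash \lfloor S\rfloor_n$;

•  if $\Gamma \vdash e : \tau$ and $\Gamma \vdash \tau \leadsto \tau'$ then
$\lfloor\Gamma\rfloor_n \vdash \lfloor e\rfloor_n : \tau'$;

•  if $\Gamma \vdash p : M$ and $\Gamma \vdash M \leadsto S$ then
$\lfloor\Gamma\rfloor_n \vdash \lfloor p\rfloor_n : S$;

•  if $\Gamma \vdash m : M$ and $\Gamma \vdash M \leadsto S$ then
$\lfloor\Gamma\rfloor_n \vdash \lfloor m\rfloor_n : S$;

•  if $\Gamma \vdash \tau \leadsto \tau'$ then $\lfloor\Gamma\rfloor_n \vdash \tau'$;

•  if $\Gamma \vdash M \leadsto S$ or $\Gamma \vdash L \leadsto S$ then
$\lfloor\Gamma\rfloor_n \vdash S$;

•  if $\Gamma \vdash M \leadsto S_1$ and $\Gamma \vdash L \leadsto S_2$ and
$\Gamma \vdash M \le L$ and
$\lfloor\Gamma\rfloor_n \vdash S_1 \,|\, \lfloor L\rfloor_c \Rightarrow C$ then
$\lfloor\Gamma\rfloor_n \vdash S_1 \le S_2/C$;

•  if $\Gamma \vdash M_1 \equiv M_2$ and $\Gamma \vdash M_1 \leadsto C_1$ and
$\Gamma \vdash M_2 \leadsto C_2$, then
$\lfloor M_1\rfloor_c = \lfloor M_2\rfloor_c$ and
$\lfloor\Gamma\rfloor_n \vdash C_1 \equiv C_2 : \lfloor M_1\rfloor_c$;

•  if $\Gamma \vdash m' : M \leadsto C_1$ and $\Gamma \vdash M \leadsto C_2$ then
$\lfloor\Gamma\rfloor_n \vdash C_1 \equiv C_2 : \lfloor M\rfloor_c$.

Proof: By structural induction on the derivation tree; along the
way, we use the following two lemmas.  □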

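Lemma 4.4 Given a TMC context $\Gamma$, suppose $\Gamma \vdash m'_1 : M_1$
and let $\rho = \{x \mapsto m'_1\}$; then

•  if $\Gamma; x : M_1 \vdash M_2$ then $\Gamma; x : M_1 \vdash \rho(M_2) \equiv M_2$;

•  if $\Gamma; x : M_1 \vdash L_2$ then $\Gamma; x : M_1 \vdash \rho(L_2) \equiv L_2$;

•  if $\Gamma; x : M_1 \vdash \tau$ then $\Gamma; x : M_1 \vdash \rho(\tau) \equiv \tau$;

•  if $\Gamma; x : M_1 \vdash m' : M_2$ then $\Gamma; x : M_1 \vdash \rho(m') : M_2$.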

Lemma 4.5 Given a TMC context $\Gamma$, a TMC module type $M$, an
EMC constructor $C$, and an EMC kind $K$, if
$\lfloor\Gamma; x : M\rfloor_n \vdash C : K$ is valid in EMC, then
$\lfloor\Gamma\rfloor_n \vdash C : K$ is valid in EMC as well.


4.3  Comparison  with  the  stamp-based  semantics 

Compilers for the strong-sum-based calculus [26, 36] use stamps
to support type generativity and abstract types (TMC does not
include these features). There are still higher-order module programs
that are supported by the stamp-based semantics but not by our
type-theoretic semantics. Take the higher-order functor APPS in
Section 2.1 as an example and consider applying it to the following
functors:

functor G1 (X: SIG) = X

functor G2 (X: SIG) = struct abstype t = A
                               with val x = A
                             end
                      end

Both applications are legal under the stamp-based semantics:
applying APPS to G1 results in a module whose t component is
equal to int, while applying APPS to G2 creates a module whose
t component is a new abstract type. Under our scheme, the
transparent version of APPS cannot be applied to G2; the abstract
version works for both, but it does not propagate sharing when applied
to G1. We believe this lack of expressiveness is not a problem in
practice.

5  Implementation 

A module system will not be practical if it cannot be type-checked
and compiled efficiently. Our EMC calculus can be checked
efficiently following the typing rules given in Section 3.2; the only
nontrivial aspect of the elaboration is how to test efficiently the
equivalence between two arbitrary EMC types. We plan to use the
realization-based approach used in the SML/NJ compiler [36] to
propagate type definitions.

EMC is also compatible with the standard type-directed
compilation techniques [18, 15, 35, 36]. Most of these techniques
are developed in the context of $F_\omega$-like polymorphic lambda
calculi [11, 33]. In this section, we define a Kernel Module Calculus
(KMC) and show how to translate EMC into KMC and then translate
KMC into an $F_\omega$-like Target Calculus (FTC).

5.1  The  kernel  module  calculus  KMC 

Unlike EMC, which is based on the ML syntax, the Kernel Module
Calculus (KMC) uses only well-known typing constructs such


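Module expression and declaration:

path  $p ::= x_i \mid p.x \mid \pi_v(p)$
mexp  $m ::= p \mid \{d_1, \ldots, d_n\} \mid \lambda x_i : M.m \mid m(p)$
      $\mid \Lambda u : K.m \mid m[C] \mid \langle u : K = C, m : M \rangle$
      $\mid \mathsf{let}\ d\ \mathsf{in}\ m$
mdec  $d ::= x_i = m \mid t_i = \tau \mid v_i = e$

Module type and constructor:

mtyp  $M ::= \{D_1, \ldots, D_n\} \mid \Pi x_i : M.M'$
      $\mid \forall u : K.M \mid \exists u : K.M$
mtfd  $D ::= x_i : M \mid t_i = \tau \mid v_i : \tau$
mcon  $C ::= u \mid \pi_t(p) \mid \{F_1, \ldots, F_n\} \mid \#x(C)$
      $\mid \lambda u : K.C \mid C_1[C_2]$
mcfd  $F ::= t = \tau \mid x = C$
mknd  $K ::= \{Q_1, \ldots, Q_n\} \mid K_1 \to K_2$
mkfd  $Q ::= t : \Omega \mid x : K$

Core language:

ctyp  $\tau ::= t_i \mid p.t \mid \#t(C) \mid \ldots$
cexp  $e ::= v_i \mid p.v \mid \ldots$

Elaboration context:

ctxt  $\Gamma ::= \varepsilon \mid \Gamma; D \mid \Gamma; u : K$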

Figure  19:  Syntax  of  the  kernel  module  calculus  KMC 


as universal quantification ($\forall$), existential quantification
($\exists$), dependent product ($\Pi$), and transparent record
($\{\cdot\}$) to model higher-order modules. The syntax of KMC is
given in Figure 19. The static semantics for KMC is summarized in
Figure 20. The complete typing rules are given in Figure 21 and in
Appendix E. The EMC-to-KMC translation is summarized in Figure 22
and its complete definition is given in Appendix F.

Like other module calculi, KMC supports a form of simple module
that consists of an ordered list of type, module, and value
declarations; in KMC we use a record syntax ($\{\cdot\}$) rather than
str ... end to represent such a simple module. Following AMC
and EMC, we assume that each declaration in KMC simultaneously
defines an internal name (e.g., $x_i$, $t_i$, $v_i$) and an external
label (e.g., $x$, $t$, $v$). Given a module record
$m = \{d_1, \ldots, d_n\}$ (or type $M = \{D_1, \ldots, D_n\}$),
declarations (or specifications) defined later can refer to those
defined earlier using the internal names.

The type structure of KMC resembles a typical predicative
polymorphic $\lambda$-calculus. The constructor calculus of KMC is
almost identical to that of EMC. A module kind (mknd) $K$
characterizes a module constructor (mcon) $C$; a module type (mtyp)
$M$ models module expressions (mexp) $m$. An elaboration context
$\Gamma$ for KMC contains bindings for core variables ($v$), core
type variables ($t$), module variables ($x$), and module type
variables ($u$).

Opaque modules are modeled with existential types [30] and
dot notation [5, 4]. Given a module path $p$ of type
$\exists u : K.M$, we use $\pi_t(p)$ to denote $p$'s constructor
component (which should have kind $K$), and $\pi_v(p)$ to denote the
module component (which should have type $[\pi_t(p)/u]M$). To
construct an opaque module, we use the module expression of the form
$\langle u : K = C, m : M \rangle$, where constructor $C$ must be of
kind $K$, module $m$ must have type $[C/u]M$, and the resulting
module has type $\exists u : K.M$.
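As a small worked instance of the two rules just described, consider packing a one-field record behind an abstract constructor and then opening it through a path. The base type int, the literal 3, the kind $\{t : \Omega\}$, and the field name $v_1$ are assumptions chosen for illustration; they are not taken from the paper's examples.

```latex
% Packing: C = {t = int} has kind K = {t : \Omega}, and the body is checked
% against [C/u]{v_1 : #t(u)}, which is equivalent to {v_1 : int}.
\langle u : \{t : \Omega\} = \{t = \mathsf{int}\},\;
        \{v_1 = 3\} : \{v_1 : \#t(u)\} \rangle
\;:\; \exists u : \{t : \Omega\}.\, \{v_1 : \#t(u)\}

% Opening: if \Gamma \vdash p : \exists u : \{t : \Omega\}.\{v_1 : \#t(u)\}, then
\Gamma \vdash \pi_t(p) : \{t : \Omega\}
\qquad
\Gamma \vdash \pi_v(p) : [\pi_t(p)/u]\{v_1 : \#t(u)\} = \{v_1 : \#t(\pi_t(p))\}
```

Outside the package the type of the field is only known as $\#t(\pi_t(p))$, so its identity with int is hidden from clients.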
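$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad x_i : M \in \Gamma}{\Gamma \vdash x_i : M}$$

$$\dfrac{\Gamma \vdash p : \{D_1, \ldots, D_k, x'_i : M, \ldots\} \qquad \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in Dom(D_1 \ldots D_k)\}}{\Gamma \vdash p.x' : \rho(M)}$$

$$\dfrac{\Gamma \vdash p : \exists u : K.M}{\Gamma \vdash \pi_v(p) : \{u \mapsto \pi_t(p)\}(M)}$$

$$\dfrac{\Gamma; x_i : M \vdash m : M'}{\Gamma \vdash \lambda x_i : M.m : \Pi x_i : M.M'}
\qquad
\dfrac{\Gamma \vdash m : \Pi x_i : M.M' \qquad \Gamma \vdash p : M}{\Gamma \vdash m(p) : \{x_i \mapsto p\}(M')}$$

$$\dfrac{\Gamma; u : K \vdash m : M}{\Gamma \vdash \Lambda u : K.m : \forall u : K.M}
\qquad
\dfrac{\Gamma \vdash m : \forall u : K.M \qquad \Gamma \vdash C : K \qquad \rho = \{u \mapsto C\}}{\Gamma \vdash m[C] : \rho(M)}$$

$$\dfrac{\vdash \Gamma\ \mathrm{ok}}{\Gamma \vdash \{\} : \{\}}
\qquad
\dfrac{\Gamma; D_1; \ldots; D_{k-1} \vdash d_k : D_k \qquad k = 1, \ldots, n}{\Gamma \vdash \{d_1, \ldots, d_n\} : \{D_1, \ldots, D_n\}}$$

$$\dfrac{\Gamma \vdash C : K \qquad \rho = \{u \mapsto C\} \qquad \Gamma \vdash m : \rho(M)}{\Gamma \vdash \langle u : K = C, m : M \rangle : \exists u : K.M}$$

$$\dfrac{\Gamma \vdash M \qquad \Gamma \vdash d : D \qquad \Gamma; D \vdash m : M}{\Gamma \vdash \mathsf{let}\ d\ \mathsf{in}\ m : M}$$

$$\dfrac{\Gamma \vdash m : M}{\Gamma \vdash (x_i = m) : (x_i : M)}
\qquad
\dfrac{\Gamma \vdash \tau}{\Gamma \vdash (t_i = \tau) : (t_i = \tau)}
\qquad
\dfrac{\Gamma \vdash e : \tau}{\Gamma \vdash (v_i = e) : (v_i : \tau)}$$

Figure 21: Selected typing rules for KMC: $\Gamma \vdash m : M$ and $\Gamma \vdash d : D$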
$\Gamma \vdash S \downarrow S' : p \leadsto m'$

ctyp equivalence    $\Gamma \vdash \tau \equiv \tau'$
mcon equivalence    $\Gamma \vdash C \equiv C' : K$
mcfd equivalence    $\Gamma \vdash F \equiv F' : Q$
mtyp equivalence    $\Gamma \vdash M \equiv M'$
mtfd equivalence    $\Gamma \vdash D \equiv D'$

Figure  20:  Static  semantics  for  KMC:  a  summary 


KMC supports two forms of parameterized modules: one abstracted
over module values (of type $M$), another over module
constructors (of kind $K$). A module function $\lambda x_i : M.m$ has
the dependent product type $\Pi x_i : M.M'$. A dependent product is
necessary because we use dot notation to access opaque modules, so
the return type of a function might refer to the actual argument.
Dot notation [4] also requires that functions in KMC be applied to
module access paths only, as in $m(p)$. This is not a problem
because we can always use let to introduce local declarations.

Polymorphic modules in KMC are parameterized over module
constructors. A module expression $\Lambda u : K.m$ has the quantified
type $\forall u : K.M$. It can be applied to a constructor $C$ if $C$
has kind $K$; the result has type $[C/u]M$.
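The substitution $[C/u]M$ used for constructor application can be sketched on a toy encoding. The tuple-based representation and the names `subst` and `type_apply` are hypothetical and not part of the paper; the sketch also omits the kind check on $C$ and assumes that bound variable names in $M$ are distinct from the free variables of $C$, so no renaming is needed.

```python
def subst(ty, u, c):
    """Return [c/u]ty for a tiny tuple-encoded type language (a sketch)."""
    tag = ty[0]
    if tag == "var":                       # ("var", name)
        return c if ty[1] == u else ty
    if tag == "const":                     # ("const", name), e.g. int
        return ty
    if tag == "record":                    # ("record", ((label, type), ...))
        return ("record", tuple((l, subst(t, u, c)) for l, t in ty[1]))
    if tag in ("forall", "exists"):        # (tag, bound_name, kind, body)
        _, v, kind, body = ty
        if v == u:                         # u is shadowed: nothing to replace
            return ty
        return (tag, v, kind, subst(body, u, c))
    raise ValueError(f"unknown type form: {tag}")


def type_apply(fun_ty, c):
    """Type of m[C] when m has type forall u:K.M (kind check omitted)."""
    if fun_ty[0] != "forall":
        raise TypeError("constructor application needs a quantified type")
    _, u, _kind, body = fun_ty
    return subst(body, u, c)


# forall u:K. {v : u} applied to int gives {v : int}
poly = ("forall", "u", "K", ("record", (("v", ("var", "u")),)))
result = type_apply(poly, ("const", "int"))
```

Here `result` is the record type whose `v` field has type int, mirroring the step from $\forall u : K.M$ to $[C/u]M$.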

Typing module identifiers and functor applications in KMC (see
Figure 21) is much simpler than in AMC and EMC. First, there is
no implicit "strengthening" when we access a module identifier.
Second, KMC does not have any form of subtyping: to type
a functor application, we must make sure that the type of the actual
argument is exactly the same as that of the functor's formal argument.


Figure  22:  Translation  from  EMC  to  KMC:  a  summary 


Proof (of Theorem 5.1): Expand this theorem to cover module
declarations and core language expressions; the generalized version
of this theorem can be proved by structural induction on the
derivation tree.  □

The translation from EMC to KMC is quite simple. Figure 22
summarizes the main translation rules; the actual definition is given
in Appendix F. Here, $\lfloor\cdot\rfloor_s$ maps EMC contexts, core
types, core expressions, module constructors, module constructor
fields, and access paths into their KMC counterparts. Two signature
translations map an EMC signature $S$ and an instantiated EMC
signature $S$, respectively, into a KMC module type $M$. The
judgement $\Gamma \vdash m : S \leadsto m'$ translates an EMC module
expression $m$ of signature $S$ into a KMC expression $m'$. We also
use the judgement $\Gamma \vdash S \downarrow S' : p \leadsto m'$ to
coerce a KMC path $p$ with type $\lfloor S\rfloor_s$ into a KMC
expression $m'$ with type $\lfloor S'\rfloor_s$, assuming $S$ and $S'$
are both instantiated signatures in EMC. We prove the following
type preservation theorem for the EMC-to-KMC translation:

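Theorem 5.2 Given an EMC context $\Gamma$, we have

•  if $\vdash \Gamma$ ok is a valid EMC deduction then
$\vdash \lfloor\Gamma\rfloor_s$ ok is valid in KMC as well; similarly,

•  if $\Gamma \vdash \tau$ then $\lfloor\Gamma\rfloor_s \vdash \lfloor\tau\rfloor_s$;

•  if $\Gamma \vdash e : \tau$ then
$\lfloor\Gamma\rfloor_s \vdash \lfloor e\rfloor_s : \lfloor\tau\rfloor_s$;

•  if $\Gamma \vdash C : K$ then
$\lfloor\Gamma\rfloor_s \vdash \lfloor C\rfloor_s : \lfloor K\rfloor_s$;

•  if $\Gamma \vdash F : Q$ then
$\lfloor\Gamma\rfloor_s \vdash \lfloor F\rfloor_s : \lfloor Q\rfloor_s$;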

•  if  Y  \~  H1  then  |TJS  F  [H^st 

•  if  Y  F  S1  then  |TJS  F  |SJ_U; 


Theorem 5.1 (unique typing) Given a KMC context $\Gamma$, suppose $m$
is a KMC module expression and $M$ and $M'$ are KMC module types;
if $\Gamma \vdash m : M$ and $\Gamma \vdash m : M'$ then
$\Gamma \vdash M \equiv M'$.
kind  $K ::= \Omega \mid K_1 \to K_2 \mid \{l_1 : K_1, \ldots, l_n : K_n\}$
con   $C ::= \ldots \mid u \mid \lambda u : K.C \mid C_1[C_2]$
      $\mid \{l_1 = C_1, \ldots, l_n = C_n\} \mid \#l(C) \mid \pi_t(p)$
type  $M ::= T(C) \mid \{l_1 : M_1, \ldots, l_n : M_n\}$
      $\mid \Pi x : M.M' \mid \forall u : K.M \mid \exists u : K.M$
path  $p ::= x \mid p.l \mid \pi_v(p)$
exp   $m ::= \ldots \mid p \mid \{l_1 = m_1, \ldots, l_n = m_n\}$
      $\mid \lambda x : M.m \mid m(p) \mid \Lambda u : K.m \mid m[C]$
      $\mid \langle u : K = C, m : M \rangle \mid \mathsf{let}\ x = m_1\ \mathsf{in}\ m_2$
ctxt  $\Gamma ::= \varepsilon \mid \Gamma; x : M \mid \Gamma; u : K$

Figure 23: Syntax of the $F_\omega$-based target calculus FTC

context formation          $\vdash \Gamma$ ok
constructor formation      $\Gamma \vdash C :: K$
type formation             $\Gamma \vdash M$
exp formation              $\Gamma \vdash m : M$
constructor equivalence    $\Gamma \vdash C \equiv C' :: K$
type equivalence           $\Gamma \vdash M \equiv M'$

Figure  24:  Static  semantics  for  FTC:  a  summary 

•  if  T  S  then  |TJS  F  |SJ&; 

•  if  r  \~  p  :  S  then  |TJS  F  [pjs  : 

•  if  T  F  m  :  S  m!  then  [rjs  F  to'  :  [SJi,; 

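•  if $\Gamma \vdash \tau \equiv \tau'$ then
$\lfloor\Gamma\rfloor_s \vdash \lfloor\tau\rfloor_s \equiv \lfloor\tau'\rfloor_s$;

•  if $\Gamma \vdash C \equiv C' : K$ then
$\lfloor\Gamma\rfloor_s \vdash \lfloor C\rfloor_s \equiv \lfloor C'\rfloor_s : \lfloor K\rfloor_s$;

•  if $\Gamma \vdash F \equiv F' : Q$ then
$\lfloor\Gamma\rfloor_s \vdash \lfloor F\rfloor_s \equiv \lfloor F'\rfloor_s : \lfloor Q\rfloor_s$.

Proof: By structural induction on the derivation tree; along the
way, we use the following lemma.  □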

Lemma 5.3 Given an EMC context $\Gamma$, suppose $S$ and $S'$ are two
instantiated EMC signatures and $p$ is a KMC access path; if
$\Gamma \vdash S \le S'$ is valid in EMC, and
$\lfloor\Gamma\rfloor_s \vdash p : \lfloor S\rfloor_s$ is valid in
KMC, and $\Gamma \vdash S \downarrow S' : p \leadsto m'$ coerces from
$p$ into $m'$, then $\lfloor\Gamma\rfloor_s \vdash m' : \lfloor S'\rfloor_s$
is valid in KMC.


5.2  The $F_\omega$-like target calculus FTC

KMC can then be easily translated into an $F_\omega$-like polymorphic
$\lambda$-calculus by simply dropping all the type components in the
KMC transparent records (after we inline all type definitions, of
course) and by merging the module constructor and the core type
expressions. The result is an $F_\omega$-based Target Calculus (FTC)
as defined in Figure 23. FTC is essentially the standard predicative
variant of the $F_\omega$ calculus extended with dot notation (i.e.,
$\pi_t(p)$ and $\pi_v(p)$), existential types ($\exists$), and
dependent products ($\Pi$). Figure 24 and Appendix G give the typing
rules for FTC. The translation from KMC to FTC is omitted since it
is rather trivial.
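Since the translation itself is omitted, the "dropping type components" step can only be illustrated loosely. The sketch below is not the paper's algorithm: the list-of-triples record encoding and the function name `erase_type_fields` are hypothetical, and the sketch assumes type definitions have already been inlined, so it neither inlines them nor merges constructors with core types.

```python
def erase_type_fields(record):
    """Drop type declarations from a record, recursing into submodules.

    A record is a list of (kind, label, payload) triples, where kind is
    "type", "value", or "module" (an encoding assumed for this sketch).
    """
    erased = {}
    for kind, label, payload in record:
        if kind == "type":
            continue                       # t = tau has no run-time content
        if kind == "module":
            erased[label] = erase_type_fields(payload)
        elif kind == "value":
            erased[label] = payload
        else:
            raise ValueError(f"unknown field kind: {kind}")
    return erased


kmc_record = [
    ("type", "t", "int"),
    ("value", "x", 3),
    ("module", "Sub", [("type", "u", "bool"), ("value", "y", True)]),
]
ftc_record = erase_type_fields(kmc_record)
```

Only the value field and the nested module survive, which matches the shape of FTC records, whose fields are all of the form $l = m$.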

The fact that all the module languages given in this paper can
be compiled into an $F_\omega$-based calculus is important because
all important type-based compilation techniques [18, 35, 15, 29, 39]
immediately become applicable to these module languages as well.
In a previous paper [36], we presented a type-preserving translation
from the MacQueen-Tofte higher-order modules [25] into an
$F_\omega$-based calculus; however, that algorithm turns all abstract
types into concrete ones, which makes it hard to reason about
type-directed operations on values with abstract types. The
translation given in this paper rightly maps all opaque modules into
abstract types, so two different types in the source calculus would
not be considered equivalent in the target calculus.

6  Related  Work 

Module systems have been an active research area in the past
decade. The ML module system was first proposed by MacQueen [24]
and later incorporated into Standard ML [27]. Harper
and Mitchell [13] show that the SML'90 module language can
be translated into a typed lambda calculus (XML) with dependent
types. Together with Moggi, they later show that even in
the presence of dependent types, type-checking of XML is still
decidable [14], thanks to the phase-distinction property of ML-style
modules. The SML'90 module language, however, contains several
major problems; for example, type abbreviations are not allowed
in signatures, opaque signature matching is not supported,
and modules are first-order only. These problems were heavily
researched [12, 19, 20, 23, 38, 26, 17] and mostly resolved in
SML'97 [28]. The main remaining issue is to design a higher-order
module calculus that satisfies all of the properties mentioned in the
beginning of this paper (see Section 1.2).

Supporting higher-order functors with fully syntactic signatures
turns out to be a very hard problem. In addition to the work
discussed at the beginning of Section 1.2, Biswas [2] gives a
semantics for the MacQueen-Tofte modules based on simple polymorphic
types. His formulation differs from the phase-splitting semantics
[14, 36] in that he does not treat functors as higher-order type
constructors. As a result, his scheme requires encoding certain type
components of kind $\Omega$ using higher-order types, which
significantly complicates the type-checking algorithm. Russo's recent
work [34] is an extension of Biswas's semantics to support opaque
modules; he uses existentials to model type generativity, but his
type-checking algorithm still relies on the use of higher-order
matching as in Biswas [2].

Our kernel module calculus (KMC) is partly inspired by the
work on parameterized signatures of Mark Jones [17]. Both
approaches use higher-order type constructors to propagate sharing
information. However, our notion of signatures differs from his in
that we allow type components inside the module record. In fact,
our module record is a transparent sum and it can contain an
ordered list of type, value, and module declarations; parameterized
signatures in Jones [17] only allow value components.

7  Conclusions 

A long-standing open problem on ML-style module systems is to
design a calculus that supports both fully transparent higher-order
functors and fully syntactic signatures. In his Ph.D. thesis [23,
page 310], Mark Lillibridge made the following assessment of the
difficulty of this problem:

In principle it should be possible to build a system with
a rich enough type system so that both separate compilation
and full transparency can be achieved at the same
time. Because separate compilation requires that all
information needed for type checking the uses of a functor
be expressible in that functor's interface, this goal will
require functor interfaces to (optionally) contain an
idealized copy of the code for the functor whose behavior
they specify. I expect such a system to be highly
complicated and hard to reason about.

This paper shows that fully transparent higher-order functors can
also have a simple type-theoretic semantics, so they can be added to
ML-like languages while still supporting true separate compilation.
Our solution only involves a conservative extension over the system
based on translucent sums and manifest types: modules that do
not use transparent higher-order functors can still have the same
signature as before.

The new insight on full transparency also improves our
understanding of other module constructs. Harper et al. [14] and
Shao [36] have given a type-preserving translation from ML-like
module languages to the polymorphic $\lambda$-calculus $F_\omega$.
Their phase-splitting translations, however, do not handle opaque
modules well: abstract types must be made concrete during the
translation. Our new translation rightly turns opaque modules and
abstract types into simple existential types.

Higher-order functors and fully syntactic signatures allow us to
accurately express the linking process of ML module programs inside
the module language itself. In the future we plan to use the
module calculus presented in this paper to formalize the configuration
language used in the SML/NJ Compilation Manager [3]. We
also plan to extend our module calculus to support dynamic linking
[22] and mutually recursive compilation units [9, 8].
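A  Static Semantics for AMC

This appendix gives the rest of the typing rules for the abstract
module calculus AMC. The formation rules for module expressions
($\Gamma \vdash m : S$) and module declarations ($\Gamma \vdash d : H$)
are given in Figure 5 in Section 3.1. The subsumption rules for
signatures ($\Gamma \vdash S \le S'$) and specifications
($\Gamma \vdash H \le H'$) are given in Figure 6 in Section 3.1.

A.1  ctxt formation: $\vdash \Gamma$ ok

$$\vdash \varepsilon\ \mathrm{ok} \qquad (1)$$

$$\dfrac{\Gamma \vdash H}{\vdash \Gamma; H\ \mathrm{ok}} \qquad (2)$$

A.2  ctyp formation: $\Gamma \vdash \tau$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad t_i \in \Gamma \ \text{or}\ t_i = \tau \in \Gamma}{\Gamma \vdash t_i} \qquad (3)$$

$$\dfrac{\Gamma \vdash p : S \qquad S = \{\ldots, t_i = \tau, \ldots\} \ \text{or}\ \{\ldots, t_i, \ldots\}}{\Gamma \vdash p.t} \qquad (4)$$

A.3  cexp formation: $\Gamma \vdash e : \tau$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad v_i : \tau \in \Gamma}{\Gamma \vdash v_i : \tau} \qquad (5)$$

$$\dfrac{\begin{array}{c}\Gamma \vdash p : \mathsf{sig}\ H_1, \ldots, H_k, \ldots, H_n\ \mathsf{end} \\ \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in Dom(X)\} \\ \text{where } X = H_1, \ldots, H_{k-1} \text{ and } H_k = (v_i : \tau)\end{array}}{\Gamma \vdash p.v : \rho(\tau)} \qquad (6)$$

A.4  sig formation: $\Gamma \vdash S$

$$\dfrac{\vdash \Gamma\ \mathrm{ok}}{\Gamma \vdash \mathsf{sig\ end}} \qquad (7)$$

$$\dfrac{\Gamma; H_1; \ldots; H_{k-1} \vdash H_k \qquad k = 1, \ldots, n}{\Gamma \vdash \mathsf{sig}\ H_1, \ldots, H_n\ \mathsf{end}} \qquad (8)$$

$$\dfrac{\Gamma \vdash S \qquad \Gamma; x_i : S \vdash S'}{\Gamma \vdash \mathsf{fsig}(x_i : S) :> S'} \qquad (9)$$

A.5  spec formation: $\Gamma \vdash H$

$$\dfrac{\Gamma \vdash S \qquad x_i \notin dom(\Gamma)}{\Gamma \vdash x_i : S} \qquad (10)$$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad t_i \notin dom(\Gamma)}{\Gamma \vdash t_i} \qquad (11)$$

$$\dfrac{\Gamma \vdash \tau \qquad t_i \notin dom(\Gamma)}{\Gamma \vdash t_i = \tau} \qquad (12)$$

$$\dfrac{\Gamma \vdash \tau}{\Gamma \vdash v_i : \tau} \qquad (13)$$

A.6  ctyp equivalence: $\Gamma \vdash \tau \equiv \tau'$

Rules for congruence, reflexivity, symmetry, and transitivity are
omitted.

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad t_i = \tau \in \Gamma}{\Gamma \vdash t_i \equiv \tau} \qquad (14)$$

$$\dfrac{\begin{array}{c}\Gamma \vdash p : \mathsf{sig}\ H_1, \ldots, H_k, \ldots, H_n\ \mathsf{end} \\ \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in Dom(X)\} \\ \text{where } X = H_1, \ldots, H_{k-1} \text{ and } H_k = (t'_i = \tau)\end{array}}{\Gamma \vdash p.t' \equiv \rho(\tau)} \qquad (15)$$


B  Static Semantics for EMC

This appendix gives the rest of the typing rules for the external
module calculus EMC. The formation rules for module constructors
($\Gamma \vdash C : K$) and module constructor fields
($\Gamma \vdash F : Q$) are given in Figure 9 in Section 3.2. The
formation rules for module expressions ($\Gamma \vdash m : S$) and
module declarations ($\Gamma \vdash d : H$) are given in Figure 12 in
Section 3.2. The subsumption rules for signatures
($\Gamma \vdash S \le S'$) and specifications ($\Gamma \vdash H \le H'$)
are given in Figure 13 in Section 3.2.

B.1  ctxt formation: $\vdash \Gamma$ ok

$$\vdash \varepsilon\ \mathrm{ok} \qquad (16)$$

$$\dfrac{\Gamma \vdash H}{\vdash \Gamma; H\ \mathrm{ok}} \qquad (17)$$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad u \notin dom(\Gamma)}{\vdash \Gamma; u : K\ \mathrm{ok}} \qquad (18)$$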
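B.2  ctyp formation: $\Gamma \vdash \tau$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad t_i \in \Gamma \ \text{or}\ t_i = \tau \in \Gamma}{\Gamma \vdash t_i} \qquad (19)$$

$$\dfrac{\Gamma \vdash p : S \qquad S = \{\ldots, t_i = \tau, \ldots\} \ \text{or}\ \{\ldots, t_i, \ldots\}}{\Gamma \vdash p.t} \qquad (20)$$

$$\dfrac{\Gamma \vdash C : K \qquad K = \{\ldots, t : \Omega, \ldots\}}{\Gamma \vdash \#t(C)} \qquad (21)$$

B.3  cexp formation: $\Gamma \vdash e : \tau$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad v_i : \tau \in \Gamma}{\Gamma \vdash v_i : \tau} \qquad (22)$$

$$\dfrac{\begin{array}{c}\Gamma \vdash p : \mathsf{sig}\ H_1, \ldots, H_k, \ldots, H_n\ \mathsf{end} \\ \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in Dom(X)\} \\ \text{where } X = H_1, \ldots, H_{k-1} \text{ and } H_k = (v_i : \tau)\end{array}}{\Gamma \vdash p.v : \rho(\tau)} \qquad (23)$$

B.4  sig formation: $\Gamma \vdash S$

$$\dfrac{\vdash \Gamma\ \mathrm{ok}}{\Gamma \vdash \mathsf{sig\ end}} \qquad (24)$$

$$\dfrac{\begin{array}{c}\forall x_i \in Dom(H_1 \ldots H_{k-1}),\ x_i \text{ is not free in } H_k \\ \Gamma; H_1; \ldots; H_{k-1} \vdash H_k \qquad k = 1, \ldots, n\end{array}}{\Gamma \vdash \mathsf{sig}\ H_1, \ldots, H_n\ \mathsf{end}} \qquad (25)$$

The side condition on $H_k$ in this rule is not absolutely necessary.
If we remove this requirement, we essentially allow free flexroot
references such as $x_i$ even when $x_i$ is a locally declared structure
component. To make this work, we need to revise the EMC
signature-strengthening operation so that all references to $x_i$ are
substituted by equivalent constructors that have no such references.
The new routine is shown in Figure 25, where $S/C$ is now implemented
as $S/(C : knd(S)); \emptyset$ with $\emptyset$ denoting the identity
substitution. The auxiliary procedure $S/(C : K); \rho \Rightarrow S'$
means that instantiating $S$ by constructor $C$ of kind $K$ under
substitution $\rho$ yields signature $S'$, and
$H/(C : K); \rho \Rightarrow H'; \rho'$ means that strengthening
specification $H$ by constructor $C$ of kind $K$ under substitution
$\rho$ yields specification $H'$ and new substitution $\rho'$.

$S/C$ is a shorthand of $S/(C : knd(S)); \emptyset$

$$\dfrac{\rho_0 = \rho \qquad H_j/(C : K); \rho_{j-1} \Rightarrow H'_j; \rho_j \qquad j = 1, \ldots, n}{(\mathsf{sig}\ H_1 \ldots H_n\ \mathsf{end})/(C : K); \rho \Rightarrow \mathsf{sig}\ H'_1 \ldots H'_n\ \mathsf{end}}$$

$$(\mathsf{fsig}(x_i : S) :> S')/(C : K); \rho \Rightarrow \mathsf{fsig}(x_i : S) :> S'$$

$$\dfrac{K = K' \to K'' \qquad S'/(C[x_i] : K''); \rho \Rightarrow S''}{(\mathsf{fsig}(x_i : S) : S')/(C : K); \rho \Rightarrow \mathsf{fsig}(x_i : S) :> S''}$$

$$\dfrac{S/(\#x(C) : K); \rho \Rightarrow S' \qquad \rho' = \rho \cup \{x_i \mapsto \#x(C)\}}{(x_i : S)/(C : \{\ldots, x : K, \ldots\}); \rho \Rightarrow (x_i : S'); \rho'}$$

$$(t_i)/(C : \{\ldots, t : \ldots\}); \rho \Rightarrow (t_i = \#t(C)); \rho$$

$$(t_i = \tau)/(C : K); \rho \Rightarrow (t_i = \rho(\tau)); \rho$$

$$(v_i : \tau)/(C : K); \rho \Rightarrow (v_i : \rho(\tau)); \rho$$

Figure 25: Signature strengthening in EMC (revised)

$$\dfrac{\Gamma \vdash S \qquad \Gamma; x_i : S \vdash S'}{\Gamma \vdash \mathsf{fsig}(x_i : S) : S'} \qquad (26)$$

$$\dfrac{\Gamma \vdash S \qquad \Gamma; x_i : S \vdash S'}{\Gamma \vdash \mathsf{fsig}(x_i : S) :> S'} \qquad (27)$$

B.5  spec formation: $\Gamma \vdash H$

$$\dfrac{\Gamma \vdash S \qquad x_i \notin dom(\Gamma)}{\Gamma \vdash x_i : S} \qquad (28)$$

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad t_i \notin dom(\Gamma)}{\Gamma \vdash t_i} \qquad (29)$$

$$\dfrac{\Gamma \vdash \tau \qquad t_i \notin dom(\Gamma)}{\Gamma \vdash t_i = \tau} \qquad (30)$$

$$\dfrac{\Gamma \vdash \tau}{\Gamma \vdash v_i : \tau} \qquad (31)$$

B.6  ctyp equivalence: $\Gamma \vdash \tau \equiv \tau'$

Rules for congruence, reflexivity, symmetry, and transitivity are
omitted.

$$\dfrac{\vdash \Gamma\ \mathrm{ok} \qquad t_i = \tau \in \Gamma}{\Gamma \vdash t_i \equiv \tau} \qquad (32)$$

$$\dfrac{\begin{array}{c}\Gamma \vdash p : \mathsf{sig}\ H_1, \ldots, H_k, \ldots, H_n\ \mathsf{end} \\ \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in Dom(X)\} \\ \text{where } X = H_1, \ldots, H_{k-1} \text{ and } H_k = (t'_i = \tau)\end{array}}{\Gamma \vdash p.t' \equiv \rho(\tau)} \qquad (33)$$
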

T  F  C  t  =  r,...} 

T  F  #t(C)  =  r 


(34) 


B.7  mcon  equivalence:  T  F  C  =  C' :  K 

Rules  for  congruence,  reflexivity,  symmetry,  and  transitivity  are 
omitted. 


$$\dfrac{\rho = \{u \mapsto C'\} \qquad \Gamma \vdash C' : K \qquad \Gamma; u : K \vdash C : K'}{\Gamma \vdash (\lambda u : K.C)[C'] \equiv \rho(C) : K'} \qquad (35)$$


_ r  b  C  :  A" _ 

r  h  (A u:K.C[u})  ~  C  :  K  —¥  K1 


(36) 


C.3  ctme  formation:  r  I -  ml  ■.  M 


r  b  C  { . r  — { . v.K'....} 

T  1-  #x(C)  =  C’  :K' 


(37) 


h  T  ok  x:L  €  T  L/x  =>  M 

rhi:M 


K  =  {Qu---,Qn)  T  \-  C  \  K 
r  (-  Fj  =  (l  =  #l(C))  :  Qj  j  -  1,.  .  .,n;  l  -  x,t]  (38) 

r  b  {Fi,...,Fn}  =  C:K 

B.8  mcfd  equivalence:  r  I-  F  =  F'  :  Q 


v  ok  lifer 
r  x:  M 


r  h  e'  :t 
r  I-  iv(e')  :  v(r) 


(46) 


(47) 


(48) 


Rules  for  congruence,  reflexivity,  symmetry,  and  transitivity  are 
omitted. 

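B.9  mknd subsumption: $\vdash K \le K'$

Rules for reflexivity and transitivity are omitted.

$$\dfrac{\sigma : \{1, \ldots, m\} \mapsto \{1, \ldots, n\} \qquad \vdash Q_{\sigma(j)} \le Q'_j \qquad j = 1, \ldots, m}{\vdash \{Q_1, \ldots, Q_n\} \le \{Q'_1, \ldots, Q'_m\}}$$

$$\dfrac{\vdash K'_1 \le K_1 \qquad \vdash K_2 \le K'_2}{\vdash K_1 \to K_2 \le K'_1 \to K'_2}$$

B.10  mkfd subsumption: $\vdash Q \le Q'$

Rules for reflexivity and transitivity are omitted.

$$\dfrac{\vdash K \le K'}{\vdash x : K \le x : K'}$$
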

C  Static  Semantics  for  TMC 

This  appendix  gives  the  complete  typing  rules  for  the  transparent 
module  calculus  TMC. 


(39) 


(40) 


(41) 


n-r 


r  b  h(t)  : 

EQ(t) 

r  b  to'  :  Ex 

:  Mi.  M2 

T  b  tti(to')  :  Ml 

T  b  to'  :  Ex:  M1.M2 

II 

l 

'-r-' 

T  b  7T2 (to') 

:  p(M'i) 

T  b  to'i  :  Ml  T;  x :  Ml  b  to2  :  M2 

T  b  ^x  —  to'i,  to2) 

:  Sx :  Mi .  M2 

T;  x  :  L  b  to'  :  M 

T  b  A x\L.m! 

:  Ux:L.M 

T  b  to'i  :  IIx:Ti.M2 
r  b  Mi  <  Li 

Tbm'j:  Mi 
p  —  {x  TO2} 

r  b  to'i  (to2  ) 

:  p(M2) 

b  mi  :  Ml  T  b  M2 

T;  x :  Mi  b  m'2  : 

T  b  let  x  —  m'i  in  m'2  :  M2 

ctce  formation:  T  b  e' 

:  r 

(49) 

(50) 

(51) 

(52) 

(53) 

(54) 

(55) 


C.1  ctxt  formation:  I -  T  ok 

I -  e  ok 

r  I -  M  x$  dom( T) 
I-  F;x:M  ok 

T  h  L  x  ^  dom( T) 
I-  T]x\L  ok 

C.2  ctyp  formation:  T  I-  r 

T  h  m'  :  EQ(r) 

T  I-  7 rt(m') 


T  h  m!  :  V(t) 

T  h  7 rv(m')  :  r 

(42) 

C.5  mtyp  formation:  T  \-  M  and  T  I-  L 

(43)  Rules  for  module  types  of  form  M : 

T  h  r 

(44)  T  b  v(t) 

T  b  r 
T  b  EQ(t) 

(45)  T;  x :  Mi  b  M2 

T  b  Sx:Mi.M2 


(56) 


(57) 


(58) 


(59) 
$$\dfrac{\Gamma; x : L \vdash M}{\Gamma \vdash \Pi x : L.M} \qquad (60)$$


r  bp:  Ex:  Mi. M2  p  =  {l4  m  (p)j 
T  b  772 (p)  :  p(M2) 


(74) 


Rules  for  module  types  of  form  L: 

r  b  t 
r  b  v(t) 

r  b  TYP 


(61) 

(62) 


T  b  e  :  r 
T  b  t„(e)  :  V(t) 


(75) 


T  b  r 

T  b  it(r)  :  EQ(r) 


(76) 


r;i:Lj  b  L2 
r  b  F,x:L\.L2 


(63) 


r  b  rwi  :  Mi  T;  x :  Mi  b  m2  :  M2 
T  b  (x—m\,m2)  :  T,x:Mi.M2 


(77) 


T;x:Li  b  L2 

r  b  IIx : L\.L2 


(64) 


L  =  5  T;  x :  L  b  m  :  M 
T  b  \x:S.m:Hx:L.M 


(78) 


C.6  cexp  formation:  r  b  e  :  r 

r  bp:  v(r) 

T  b  77„  (p)  :  r 

C.7  ctsp  formation:  r  b  p 


(65) 


r  b  pi  :  IIx:Ei.M2  F  p2  :  Mi 

r  b  Ml  <  Li  p  —  {x  1-4  p2}  (79) 

r  b  pi(p2)  :  p(M2) 

r  b  mi  :  Mi  r  b  M2  T;  x  :  Mi  b  m2  :  M2 
T  b  let  x—mi  in  m2  :  M2 


rbp:  EQ(t) 

r  b  Mp) 


C.10  ctyp equivalence:  r  b  71=75 

(66) 

Rules  for  congruence,  reflexivity,  symmetry,  and  transitivity  are 
omitted. 


C.8  sig  formation:  r  b  S 

Adding  bindings  such  as  “x  :  S”  to  the  context  T  is  fine  because 
each  signature  S  is  also  a  module  type  of  form  L. 


r  b  m'  :  EQ(t) 
T  b  77 t{m')  =  t 


(81) 


rbp 

r  1-  v(p) 

r  b  TYP 

T;x:5i  b  52 
T  b  Ex:  Si .52 

r;x:5i  b  s2 
r  b  nx:5i.52 

C.9  mexp formation:  r  b 

b  T  ok  x:LeT  L/x  =>  M 
r  b  x  :  M 

b  T  ok  x:  M  6  T 
r  b  x  :  M 


(67) 

(68) 

(69) 

(70) 


Rules  for  congruence,  reflexivity,  symmetry,  and  transitivity  are 
omitted. 

C.12  mtyp  subsumption:  r  b  M  <L 


r  b  EQ(l-)  <  TYP 


r  b  Tl=T2 

r  b  v(ri)  <  v(r2) 


(82) 


(83) 


(71) 


(72) 


r  b  Mi  <  Li  T;  x :  Mi  b  M2  <  L2 
r  b  Ex:  M1.M2  <  Ex : L\.L2 


r;x:Ei  b  M2<L2 
r  b  Tlx:  Li.M2  <  IIx :  Li.L2 


(84) 


(85) 


T  b  p:  Ex:  Mi. M2 

T  b  77i  (p)  :  Mi 
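The subsumption rules of C.12 are syntax-directed, so they can be read as a recursive check. The following is only a minimal sketch: the tuple encoding of module types is hypothetical, core type equivalence is approximated by a caller-supplied predicate (syntactic equality by default), and the typing context is ignored.

```python
# Hypothetical encoding of TMC module types as tuples:
#   ("V", tau)  ("EQ", tau)  ("TYP",)  ("Sigma", x, A, B)  ("Pi", x, A, B)

def subsumes(m, l, type_eq=lambda a, b: a == b):
    """Return True if M-form type m is subsumed by L-form type l."""
    if m[0] == "EQ" and l[0] == "TYP":
        # A manifest type may be forgotten: EQ(tau) <= TYP.
        return True
    if m[0] == "V" and l[0] == "V":
        # Value components must have equivalent core types.
        return type_eq(m[1], l[1])
    if m[0] == "Sigma" and l[0] == "Sigma":
        # Both components are compared covariantly.
        return (m[1] == l[1]
                and subsumes(m[2], l[2], type_eq)
                and subsumes(m[3], l[3], type_eq))
    if m[0] == "Pi" and l[0] == "Pi":
        # The argument type is the same on both sides; only the result varies.
        return m[1] == l[1] and m[2] == l[2] and subsumes(m[3], l[3], type_eq)
    return False
```

Note that the functor case compares argument types for identity rather than contravariantly, mirroring the shape of the rule in which both sides share the same $L_1$.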
C.13  mtyp strengthening: $L/m' \Rightarrow M$

\[ V(\tau)/m' \Rightarrow V(\tau) \]
\[ \mathrm{TYP}/m' \Rightarrow \mathrm{EQ}(\pi_t(m')) \]
\[ \frac{L_1/\pi_1(m') \Rightarrow M_1 \qquad L_2/\pi_2(m') \Rightarrow M_2}{\Sigma x : L_1 . L_2 / m' \Rightarrow \Sigma x : M_1 . M_2} \]
\[ \frac{L_2/m'(x) \Rightarrow M_2}{\Pi x : L_1 . L_2 / m' \Rightarrow \Pi x : L_1 . M_2} \tag{89} \]
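Strengthening can be read as a structural recursion over the $L$-form type. The sketch below is illustrative only: the tuple encoding and the constructor names (`pi1`, `pi2`, `pi_t`, `app`, `var`) are hypothetical, and the TYP case assumes that an abstract type component is strengthened to $\mathrm{EQ}(\pi_t(m'))$.

```python
# Hypothetical encoding:
#   types:  ("V", tau)  ("TYP",)  ("EQ", tau)  ("Sigma", x, A, B)  ("Pi", x, A, B)
#   module expressions: ("var", x)  ("pi1", m)  ("pi2", m)  ("app", m, arg)
#   core type projected from a module: ("pi_t", m)

def strengthen(l, m):
    """Compute M such that l/m => M."""
    tag = l[0]
    if tag == "V":
        # Value components carry no type identity to record.
        return l
    if tag == "TYP":
        # Assumption: the abstract type becomes manifest, equal to pi_t(m).
        return ("EQ", ("pi_t", m))
    if tag == "Sigma":
        _, x, l1, l2 = l
        return ("Sigma", x,
                strengthen(l1, ("pi1", m)),
                strengthen(l2, ("pi2", m)))
    if tag == "Pi":
        _, x, l1, l2 = l
        # Only the result is strengthened, against the application m(x).
        return ("Pi", x, l1, strengthen(l2, ("app", m, ("var", x))))
    raise ValueError(f"not an L-form module type: {l!r}")
```

For example, strengthening `("Sigma", "x", ("TYP",), ("V", "t"))` against a variable `m` makes the first component manifest as the type projected from `pi1(m)` and leaves the value component unchanged.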

D  Translation from TMC to EMC

This appendix gives the complete translation algorithm from TMC to EMC. The main translation is denoted by $\lfloor\cdot\rfloor_n$; the auxiliary function that maps from $M$ or $L$ to an EMC kind $K$ is denoted by $\lfloor\cdot\rfloor_c$; the translation from $M$ or $L$ to an EMC signature $S$ is represented as $\Gamma \vdash M \leadsto S$ and $\Gamma \vdash L \leadsto S$; the translation from $M$ or $m'$ to an EMC constructor $C$ is represented as $\Gamma \vdash M \leadsto C$ and $\Gamma \vdash m' : M \leadsto C$; the translation from $\tau$ to an EMC core type expression $\tau'$ is represented by $\Gamma \vdash \tau \leadsto \tau'$.

Throughout the translation, we use fst and snd to denote EMC module labels, ops for an EMC value label, and typ for an EMC type label. A TMC module identifier $x$ is translated to an EMC identifier $\mathtt{fst}_x$, where fst is an EMC module label and $x$ denotes the stamp.

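D.1  ctxt-to-ctxt translation: $\lfloor\Gamma\rfloor_n \mapsto \Gamma$

\[ \lfloor\varepsilon\rfloor_n = \varepsilon \]
\[ \frac{\Gamma \vdash L \leadsto S}{\lfloor\Gamma; x : L\rfloor_n = \lfloor\Gamma\rfloor_n;\ \mathtt{fst}_x : S} \]
\[ \frac{\Gamma \vdash M \leadsto S}{\lfloor\Gamma; x : M\rfloor_n = \lfloor\Gamma\rfloor_n;\ \mathtt{fst}_x : S} \]

D.2  ctsp-to-ctyp translation: $\lfloor\mu\rfloor_n \mapsto \tau$

\[ \lfloor\pi_t(p)\rfloor_n = \lfloor p\rfloor_n.\mathtt{typ} \]

D.5  path-to-path translation: $\lfloor p\rfloor_n \mapsto p$

\[ \lfloor\pi_1(p)\rfloor_n = \lfloor p\rfloor_n.\mathtt{fst} \]
\[ \lfloor\pi_2(p)\rfloor_n = \lfloor p\rfloor_n.\mathtt{snd} \]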


D.6  mexp-to-mexp  translation:  [m\n  >-*■  m 


[Pir 
\_iv  (c)  J  r 
Ut(p)Jr 
[{x  =  TWi,ra2)Jr 


[A* :  S.m\r 

[pi(p2)]r 

[let  x  —  mi  in  7712]* 


str  ops,;  =  LeJ„  end 
str  typ,;  =  La*J  n  end 

str  fst*  =  [raijn? 
snd*/  =  Lm2J„ 

end 

fct(f st* : L5J„)LmJ„ 

blj«(b2jn) 

let  fst*  =  Lmijn  in  |m2jr 


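D.7  ctyp-to-ctyp translation: $\Gamma \vdash \tau \leadsto \tau'$

The translation of a core type in TMC is based on its formation rules. Given a well-formed core type $\tau$ in context $\Gamma$, it is translated into the EMC type $\tau'$ if and only if the judgement $\Gamma \vdash \tau \leadsto \tau'$ is valid.

\[ \frac{\Gamma \vdash m' : M \leadsto C}{\Gamma \vdash \pi_t(m') \leadsto \#\mathtt{typ}(C)} \]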

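D.8  mtyp-to-sig translation: $\Gamma \vdash M \leadsto S$

\[ \frac{\Gamma \vdash \tau \leadsto \tau'}{\Gamma \vdash V(\tau) \leadsto \mathtt{sig}\ \mathtt{ops}_j : \tau'\ \mathtt{end}} \]
\[ \frac{\Gamma \vdash \tau \leadsto \tau'}{\Gamma \vdash \mathrm{EQ}(\tau) \leadsto \mathtt{sig}\ \mathtt{typ}_j = \tau'\ \mathtt{end}} \]
\[ \frac{\Gamma \vdash M_1 \leadsto S_1 \qquad \Gamma; x : M_1 \vdash M_2 \leadsto S_2}{\Gamma \vdash \Sigma x : M_1 . M_2 \leadsto \mathtt{sig}\ \mathtt{fst}_x : S_1;\ \mathtt{snd}_{x'} : S_2\ \mathtt{end}} \]
\[ \frac{\Gamma \vdash L_1 \leadsto S_1 \qquad \Gamma; x : L_1 \vdash M_2 \leadsto S_2}{\Gamma \vdash \Pi x : L_1 . M_2 \leadsto \mathtt{fsig}(\mathtt{fst}_x : S_1) : S_2} \]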


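D.3  sig-to-sig translation: $\lfloor S\rfloor_n \mapsto S$

\[ \lfloor V(\mu)\rfloor_n = \mathtt{sig}\ \mathtt{ops}_j : \lfloor\mu\rfloor_n\ \mathtt{end} \]
\[ \lfloor\mathrm{TYP}\rfloor_n = \mathtt{sig}\ \mathtt{typ}_j\ \mathtt{end} \]
\[ \lfloor\Sigma x : S_1 . S_2\rfloor_n = \mathtt{sig}\ \mathtt{fst}_x : \lfloor S_1\rfloor_n;\ \mathtt{snd}_{x'} : \lfloor S_2\rfloor_n\ \mathtt{end} \]
\[ \lfloor\Pi x : S_1 . S_2\rfloor_n = \mathtt{fsig}(\mathtt{fst}_x : \lfloor S_1\rfloor_n) : \lfloor S_2\rfloor_n \]

D.4  cexp-to-cexp translation: $\lfloor e\rfloor_n \mapsto e$

\[ \lfloor\pi_v(p)\rfloor_n = \lfloor p\rfloor_n.\mathtt{ops} \]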


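D.9  mtyp-to-sig translation: $\Gamma \vdash L \leadsto S$

\[ \frac{\Gamma \vdash \tau \leadsto \tau'}{\Gamma \vdash V(\tau) \leadsto \mathtt{sig}\ \mathtt{ops}_j : \tau'\ \mathtt{end}} \]
\[ \Gamma \vdash \mathrm{TYP} \leadsto \mathtt{sig}\ \mathtt{typ}_j\ \mathtt{end} \]
\[ \frac{\Gamma \vdash L_1 \leadsto S_1 \qquad \Gamma; x : L_1 \vdash L_2 \leadsto S_2}{\Gamma \vdash \Sigma x : L_1 . L_2 \leadsto \mathtt{sig}\ \mathtt{fst}_x : S_1;\ \mathtt{snd}_{x'} : S_2\ \mathtt{end}} \]
\[ \frac{\Gamma \vdash L_1 \leadsto S_1 \qquad \Gamma; x : L_1 \vdash L_2 \leadsto S_2}{\Gamma \vdash \Pi x : L_1 . L_2 \leadsto \mathtt{fsig}(\mathtt{fst}_x : S_1) : S_2} \]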
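D.10  mtyp-to-kind translation: $\lfloor M\rfloor_c \mapsto K$

\[ \lfloor V(\tau)\rfloor_c = \{\} \]
\[ \lfloor\mathrm{EQ}(\tau)\rfloor_c = \{\mathtt{typ} : \Omega\} \]
\[ \lfloor\Sigma x : M_1 . M_2\rfloor_c = \{\mathtt{fst} : \lfloor M_1\rfloor_c,\ \mathtt{snd} : \lfloor M_2\rfloor_c\} \]
\[ \lfloor\Pi x : L . M\rfloor_c = \lfloor L\rfloor_c \to \lfloor M\rfloor_c \]

The following two rules belong to the ctme-to-mcon translation $\Gamma \vdash m' : M \leadsto C$ (D.13):

\[ \frac{\Gamma; x : L \vdash m' : M \leadsto C \qquad \rho = \{\mathtt{fst}_x \mapsto u\}}{\Gamma \vdash \lambda x : L . m' : \Pi x : L . M \leadsto \lambda u : \lfloor L\rfloor_c . \rho(C)} \]
\[ \frac{\Gamma \vdash m'_1 : \Pi x : L_1 . M_2 \leadsto C_1 \qquad \Gamma \vdash m'_2 : M_1 \leadsto C_2 \qquad \Gamma \vdash M_1 \le L_1 \qquad \rho = \{x \mapsto m'_2\}}{\Gamma \vdash m'_1(m'_2) : \rho(M_2) \leadsto C_1[C_2]} \]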


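D.11  mtyp-to-kind translation: $\lfloor L\rfloor_c \mapsto K$

\[ \lfloor V(\tau)\rfloor_c = \{\} \]
\[ \lfloor\mathrm{TYP}\rfloor_c = \{\mathtt{typ} : \Omega\} \]
\[ \lfloor\Sigma x : L_1 . L_2\rfloor_c = \{\mathtt{fst} : \lfloor L_1\rfloor_c,\ \mathtt{snd} : \lfloor L_2\rfloor_c\} \]
\[ \lfloor\Pi x : L_1 . L_2\rfloor_c = \lfloor L_1\rfloor_c \to \lfloor L_2\rfloor_c \]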
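The two mtyp-to-kind translations erase everything except the shape of the type components. A minimal sketch, assuming a hypothetical tuple encoding of module types and representing record kinds as Python dictionaries and arrow kinds as `("->", K1, K2)`:

```python
OMEGA = "Omega"  # the kind of ordinary types

# Hypothetical encoding of module types:
#   ("V", tau)  ("EQ", tau)  ("TYP",)  ("Sigma", x, A, B)  ("Pi", x, A, B)

def kind_of_mtyp(t):
    """Translate an M-form or L-form module type to an EMC kind."""
    tag = t[0]
    if tag == "V":
        # A value component contributes no type information.
        return {}
    if tag in ("EQ", "TYP"):
        # Manifest and abstract type components both have one type field.
        return {"typ": OMEGA}
    if tag == "Sigma":
        return {"fst": kind_of_mtyp(t[2]), "snd": kind_of_mtyp(t[3])}
    if tag == "Pi":
        # A functor's type part is a function between the type parts.
        return ("->", kind_of_mtyp(t[2]), kind_of_mtyp(t[3]))
    raise ValueError(f"unknown module type: {t!r}")
```

Because the translation never inspects the bound variable or the core types, a signature and any of its strengthened forms translate to the same kind.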


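D.12  mtyp-to-mcon translation: $\Gamma \vdash M \leadsto C$

M-shaped TMC module types can be translated into EMC module constructors. The translation is based on the type formation rules for $M$.

\[ \Gamma \vdash V(\tau) \leadsto \{\} \]
\[ \frac{\Gamma \vdash \tau \leadsto \tau'}{\Gamma \vdash \mathrm{EQ}(\tau) \leadsto \{\mathtt{typ} = \tau'\}} \]
\[ \frac{\Gamma \vdash M_1 \leadsto C_1 \qquad \Gamma; x : M_1 \vdash M_2 \leadsto C_2}{\Gamma \vdash \Sigma x : M_1 . M_2 \leadsto \{\mathtt{fst} = C_1,\ \mathtt{snd} = C_2\}} \]
\[ \frac{\Gamma; x : L_1 \vdash M_2 \leadsto C_2 \qquad \rho = \{\mathtt{fst}_x \mapsto u\}}{\Gamma \vdash \Pi x : L_1 . M_2 \leadsto \lambda u : \lfloor L_1\rfloor_c . \rho(C_2)} \]

D.13  ctme-to-mcon translation: $\Gamma \vdash m' : M \leadsto C$

All TMC module expressions ($m'$) embedded inside the core types can be translated into EMC module constructors. The translation is based on the formation rules for $m'$.

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad x : L \in \Gamma \qquad L/x \Rightarrow M}{\Gamma \vdash x : M \leadsto \mathtt{fst}_x} \]
\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad x : M \in \Gamma \qquad \Gamma \vdash M \leadsto C}{\Gamma \vdash x : M \leadsto C} \]
\[ \frac{\Gamma \vdash e' : \tau}{\Gamma \vdash \iota_v(e') : V(\tau) \leadsto \{\}} \]
\[ \frac{\Gamma \vdash \tau \leadsto \tau'}{\Gamma \vdash \iota_t(\tau) : \mathrm{EQ}(\tau) \leadsto \{\mathtt{typ} = \tau'\}} \]
\[ \frac{\Gamma \vdash m' : \Sigma x : M_1 . M_2 \leadsto C}{\Gamma \vdash \pi_1(m') : M_1 \leadsto \#\mathtt{fst}(C)} \]
\[ \frac{\Gamma \vdash m' : \Sigma x : M_1 . M_2 \leadsto C \qquad \rho = \{x \mapsto \pi_1(m')\}}{\Gamma \vdash \pi_2(m') : \rho(M_2) \leadsto \#\mathtt{snd}(C)} \]
\[ \frac{\Gamma \vdash m'_1 : M_1 \leadsto C_1 \qquad \Gamma; x : M_1 \vdash m'_2 : M_2 \leadsto C_2}{\Gamma \vdash \langle x = m'_1, m'_2 \rangle : \Sigma x : M_1 . M_2 \leadsto \{\mathtt{fst} = C_1,\ \mathtt{snd} = C_2\}} \]
\[ \frac{\Gamma \vdash m'_1 : M_1 \qquad \Gamma \vdash M_2 \qquad \Gamma; x : M_1 \vdash m'_2 : M_2 \leadsto C}{\Gamma \vdash \mathtt{let}\ x = m'_1\ \mathtt{in}\ m'_2 : M_2 \leadsto C} \]

E  Static Semantics for KMC

This appendix gives the rest of the typing rules for the kernel module calculus KMC. The formation rules for module expressions ($\Gamma \vdash m : M$) and module declarations ($\Gamma \vdash d : D$) are given in Figure 21 in Section 5.1.

E.1  ctxt formation: $\vdash \Gamma\ \mathrm{ok}$

\[ \vdash \varepsilon\ \mathrm{ok} \tag{90} \]
\[ \frac{\Gamma \vdash D}{\vdash \Gamma; D\ \mathrm{ok}} \tag{91} \]
\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad u \notin \mathrm{dom}(\Gamma)}{\vdash \Gamma; u : K\ \mathrm{ok}} \tag{92} \]

E.2  ctyp formation: $\Gamma \vdash \tau$

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad t_i = \tau \in \Gamma}{\Gamma \vdash t_i} \tag{93} \]
\[ \frac{\Gamma \vdash p : M \qquad M = \ldots}{\Gamma \vdash p.t} \tag{94} \]
\[ \frac{\Gamma \vdash C : K \qquad K = \{\ldots, t : \Omega, \ldots\}}{\Gamma \vdash \#t(C)} \tag{95} \]

E.3  cexp formation: $\Gamma \vdash e : \tau$

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad v_i : \tau \in \Gamma}{\Gamma \vdash v_i : \tau} \tag{96} \]
\[ \frac{\begin{array}{c} \Gamma \vdash p : \{D_1, \ldots, D_n\} \qquad \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in \mathrm{Dom}(X)\} \\ \text{where } X = D_1, \ldots, D_{k-1} \text{ and } D_k = (v_i : \tau) \end{array}}{\Gamma \vdash p.v : \rho(\tau)} \tag{97} \]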
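E.4  mcon formation: $\Gamma \vdash C : K$

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad u : K \in \Gamma}{\Gamma \vdash u : K} \tag{98} \]
\[ \frac{\Gamma \vdash p : \exists u : K . M}{\Gamma \vdash \pi_t(p) : K} \tag{99} \]
\[ \frac{\Gamma \vdash F_j : Q_j \qquad j = 1, \ldots, n}{\Gamma \vdash \{F_1, \ldots, F_n\} : \{Q_1, \ldots, Q_n\}} \tag{100} \]
\[ \frac{\Gamma \vdash C : K' \qquad K' = \{\ldots, x : K, \ldots\}}{\Gamma \vdash \#x(C) : K} \tag{101} \]
\[ \frac{\Gamma; u : K \vdash C : K'}{\Gamma \vdash \lambda u : K . C : K \to K'} \tag{102} \]
\[ \frac{\Gamma \vdash C_1 : K \to K' \qquad \Gamma \vdash C_2 : K}{\Gamma \vdash C_1[C_2] : K'} \tag{103} \]

E.5  mcfield formation: $\Gamma \vdash F : Q$

\[ \frac{\Gamma \vdash \tau}{\Gamma \vdash t = \tau : (t : \Omega)} \tag{104} \]
\[ \frac{\Gamma \vdash C : K}{\Gamma \vdash x = C : (x : K)} \tag{105} \]

E.6  mtyp formation: $\Gamma \vdash M$

\[ \frac{\vdash \Gamma\ \mathrm{ok}}{\Gamma \vdash \{\}} \tag{106} \]
\[ \frac{\Gamma; D_1 \vdash \{D_2, \ldots, D_n\}}{\Gamma \vdash \{D_1, \ldots, D_n\}} \tag{107} \]
\[ \frac{\Gamma; x_i : M \vdash M'}{\Gamma \vdash \Pi x_i : M . M'} \tag{108} \]
\[ \frac{\Gamma; u : K \vdash M}{\Gamma \vdash \forall u : K . M} \tag{109} \]
\[ \frac{\Gamma; u : K \vdash M}{\Gamma \vdash \exists u : K . M} \tag{110} \]

E.7  mtfd formation: $\Gamma \vdash D$

\[ \frac{\Gamma \vdash M \qquad x_i \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash x_i : M} \tag{111} \]
\[ \frac{\Gamma \vdash \tau \qquad t_i \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash t_i = \tau} \tag{112} \]
\[ \frac{\Gamma \vdash \tau}{\Gamma \vdash v_i : \tau} \tag{113} \]

E.8  ctyp equivalence: $\Gamma \vdash \tau \equiv \tau'$

Rules for congruence, reflexivity, symmetry, and transitivity are omitted.

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad t_i = \tau \in \Gamma}{\Gamma \vdash t_i \equiv \tau} \tag{114} \]
\[ \frac{\begin{array}{c} \Gamma \vdash p : \{D_1, \ldots, D_k, \ldots\} \qquad \rho = \{t_i \mapsto p.t,\ x_i \mapsto p.x \mid t_i, x_i \in \mathrm{Dom}(X)\} \\ \text{where } X = D_1, \ldots, D_{k-1} \text{ and } D_k = (t'_i = \tau) \end{array}}{\Gamma \vdash p.t' \equiv \rho(\tau)} \tag{115} \]
\[ \frac{\Gamma \vdash C \equiv \{\ldots, t = \tau, \ldots\} : K}{\Gamma \vdash \#t(C) \equiv \tau} \tag{116} \]

E.9  mcon equivalence: $\Gamma \vdash C \equiv C' : K$

Rules for congruence, reflexivity, symmetry, and transitivity are omitted.

\[ \frac{\rho = \{u \mapsto C'\} \qquad \Gamma \vdash C' : K \qquad \Gamma; u : K \vdash C : K'}{\Gamma \vdash (\lambda u : K . C)[C'] \equiv \rho(C) : K'} \tag{117} \]
\[ \frac{\Gamma \vdash C : K \to K'}{\Gamma \vdash (\lambda u : K . C[u]) \equiv C : K \to K'} \tag{118} \]
\[ \frac{\Gamma \vdash C \equiv \{\ldots, x = C', \ldots\} : \{\ldots, x : K', \ldots\}}{\Gamma \vdash \#x(C) \equiv C' : K'} \tag{119} \]
\[ \frac{K = \{Q_1, \ldots, Q_n\} \qquad \Gamma \vdash C : K \qquad \Gamma \vdash F_j \equiv (x = \#x(C)) : Q_j \qquad j = 1, \ldots, n}{\Gamma \vdash \{F_1, \ldots, F_n\} \equiv C : K} \tag{120} \]

E.10  mcfd equivalence: $\Gamma \vdash F \equiv F' : Q$

Rules for congruence, reflexivity, symmetry, and transitivity are omitted.

E.11  mtyp equivalence: $\Gamma \vdash M \equiv M'$

Rules for congruence, reflexivity, symmetry, and transitivity are omitted.

\[ \frac{\Gamma; D_1; \ldots; D_{k-1} \vdash D_k \equiv D'_k}{\Gamma \vdash \{D_1, \ldots, D_n\} \equiv \{D'_1, \ldots, D'_n\}} \tag{121} \]

E.12  mtfd equivalence: $\Gamma \vdash D \equiv D'$

Rules for congruence, reflexivity, symmetry, and transitivity are omitted.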


F  Translation from EMC to KMC

This appendix gives a type-preserving translation algorithm from EMC to KMC. To make the presentation easier, we first modify the EMC syntax to distinguish different uses of module access paths:

\[ \begin{array}{lrcl} \text{path} & p & ::= & x_i \mid p.x \\ \text{mexp} & m & ::= & (p) \mid \mathtt{str}\ d_1, \ldots, d_n\ \mathtt{end} \mid \mathtt{fct}(x_i : S)\,m \mid p_1(p_2) \mid (p :> S) \mid \mathtt{let}\ d\ \mathtt{in}\ m \end{array} \]

Here, we use $(p)$ to denote the places where a module path $p$ is used as a stand-alone module expression. We then separate the formation rules for module paths from regular module expressions:

\[ \begin{array}{ll} \text{path formation} & \Gamma \vdash p : S \\ \text{mexp formation} & \Gamma \vdash m : S \end{array} \]

As a result of this reorganization, we add the following rule to the mexp formation:

\[ \frac{\Gamma \vdash p : S}{\Gamma \vdash (p) : S} \]

The EMC-to-KMC translation is denoted as $\lfloor\cdot\rfloor_s$. A special auxiliary function that translates an EMC signature $S$ into a KMC existential module type $M$ is denoted as $\lfloor\cdot\rfloor_b$.

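F.1  ctxt-to-ctxt translation: $\lfloor\Gamma\rfloor_s \mapsto \Gamma$

\[ \lfloor\varepsilon\rfloor_s = \varepsilon \]
\[ \lfloor\Gamma; t_i = \tau\rfloor_s = \lfloor\Gamma\rfloor_s;\ t_i = \lfloor\tau\rfloor_s \]
\[ \lfloor\Gamma; v_i : \tau\rfloor_s = \lfloor\Gamma\rfloor_s;\ v_i : \lfloor\tau\rfloor_s \]
\[ \lfloor\Gamma; x_i : S\rfloor_s = \lfloor\Gamma\rfloor_s;\ x_i : \lfloor S\rfloor_b \]
\[ \lfloor\Gamma; u : K\rfloor_s = \lfloor\Gamma\rfloor_s;\ u : K \]

F.2  ctyp-to-ctyp translation: $\lfloor\tau\rfloor_s \mapsto \tau$

\[ \lfloor t_i\rfloor_s = t_i \]
\[ \lfloor p.t\rfloor_s = \lfloor p\rfloor_s.t \]
\[ \lfloor\#t(C)\rfloor_s = \#t(\lfloor C\rfloor_s) \]

F.3  cexp-to-cexp translation: $\lfloor e\rfloor_s \mapsto e$

\[ \lfloor v_i\rfloor_s = v_i \]
\[ \lfloor p.v\rfloor_s = \lfloor p\rfloor_s.v \]

F.4  mcfd-to-mcfd translation: $\lfloor F\rfloor_s \mapsto F$

\[ \lfloor(t = \tau)\rfloor_s = (t = \lfloor\tau\rfloor_s) \]
\[ \lfloor(x = C)\rfloor_s = (x = \lfloor C\rfloor_s) \]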

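F.5  mcon-to-mcon translation: $\lfloor C\rfloor_s \mapsto C$

The case for $x_i$ cannot occur since all of their free references have been replaced by a new constructor variable in KMC.

\[ \lfloor\{F_1, \ldots, F_n\}\rfloor_s = \{\lfloor F_1\rfloor_s, \ldots, \lfloor F_n\rfloor_s\} \]
\[ \lfloor\#x(C)\rfloor_s = \#x(\lfloor C\rfloor_s) \]
\[ \lfloor\lambda u : K . C\rfloor_s = \lambda u : K . \lfloor C\rfloor_s \]
\[ \lfloor C_1[C_2]\rfloor_s = \lfloor C_1\rfloor_s[\lfloor C_2\rfloor_s] \]
\[ \lfloor u\rfloor_s = u \]
\[ \lfloor x_i\rfloor_s = \pi_t(x_i) \]

F.6  path-to-path translation: $\lfloor p\rfloor_s \mapsto p$

\[ \lfloor x_i\rfloor_s = \pi_v(x_i) \]
\[ \lfloor p.x\rfloor_s = \lfloor p\rfloor_s.x \]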

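F.7  sig-to-mtyp translation: $\lfloor S\rfloor_b \mapsto M$ and $\lfloor S\rfloor_s \mapsto M$

The translation from an EMC signature to a KMC module type is defined as:

\[ \lfloor S\rfloor_b = \exists u : \mathrm{knd}(S) . \lfloor S/u\rfloor_s \]

Here the internal translation $\lfloor\cdot\rfloor_s$ is applied to instantiated signatures only:

\[ \lfloor\mathtt{sig}\ H_1, \ldots, H_n\ \mathtt{end}\rfloor_s = \{\lfloor H_1\rfloor_s, \ldots, \lfloor H_n\rfloor_s\} \]
\[ \lfloor\mathtt{fsig}(x_i : S) :> S'\rfloor_s = \forall u : \mathrm{knd}(S) . \Pi x_i : \lfloor S/u\rfloor_s . \{\pi_t(x_i) \mapsto u,\ \pi_v(x_i) \mapsto x_i\}\lfloor S'\rfloor_b \]
\[ \lfloor(x_i : S)\rfloor_s = (x_i : \lfloor S\rfloor_s) \]
\[ \lfloor(t_i = \tau)\rfloor_s = (t_i = \lfloor\tau\rfloor_s) \]
\[ \lfloor v_i : \tau\rfloor_s = (v_i : \lfloor\tau\rfloor_s) \]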

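F.8  mexp-to-mexp translation: $\Gamma \vdash m : S \leadsto m'$

The translation from EMC mexp to KMC mexp is conducted along the EMC typing rules. Given a context $\Gamma$, an EMC module expression $m$ is translated into a KMC expression $m'$ if and only if $\Gamma \vdash m : S \leadsto m'$.

In the following, we use $@\,m\,m'$ to denote the KMC module expression "let $x_i = m'$ in $m(x_i)$", where $x_i$ is a module identifier that does not occur free in $m$.

\[ \frac{\Gamma \vdash p : S \qquad \Gamma \vdash S \downarrow \mathrm{knd}(S) \Rightarrow C \qquad M = \lfloor S/u\rfloor_s}{\Gamma \vdash (p) : S \leadsto \langle u : \mathrm{knd}(S) = \lfloor C\rfloor_s,\ \lfloor p\rfloor_s : M \rangle} \]

Note: according to Lemma 3.1, $S$ is an instantiated signature, so $\mathrm{knd}(S)$ is always the empty kind $\{\}$ and $C$ is the empty constructor $\{\}$. We always pack each structure expression this way so that we can uniformly translate each structure identifier $x_i$ in EMC into $\pi_v(x_i)$ in KMC.

\[ \frac{\Gamma; x_i : S \vdash m : S' \leadsto m' \qquad \rho = \{\pi_t(x_i) \mapsto u,\ \pi_v(x_i) \mapsto x_i\} \qquad m'' = \Lambda u : \mathrm{knd}(S) . \lambda x_i : \lfloor S/u\rfloor_s . \rho(m')}{\Gamma \vdash \mathtt{fct}(x_i : S)\,m : \mathtt{fsig}(x_i : S) : S' \leadsto m''} \]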

T  b  pi  :  fsig(Xj:5):5'  Tb  p2  ■  S" 

T  b  5"  <  S  r  b  S"4.knd(S)  =>  C 
T  b  S"  4  (s/C)  :  |_p2Js  m  p  —  {xi  A  C,  Ij  A  p] 

r  b  pi(p2) :  p(s’)  ©(LptJsILCJsD™ 




_ b  r  ok _ 

T  b  str  end:sig  end~>  {«:{}  =  {},  {}:{}) 


r;  Hi] . .  . ;  Hk-i  E  dk  :  Hk  d};  d";  k  —  1, ...  ,n 
S  —  sig  Hi, . . . ,  Hn  end  K  —  knd(5) 

5'  =  sig-ffi,...,#;  end  rhS'jK^C 

m={u:K=lC\s,{d'{,...,d';}:lS/u\s) 

T  h  str  di  , . . .  ,  dn  end  :  5  let  d[  ■  ■  ■  d'n  in  to 

rh  p:S'  TE5'<5  r  h  S'4-knd(5)  c 
r  E  S'  4  (S/C)  :  LpJ5  to'  =  (to:  |5/CJS) 

Y  E  (p:>5):5^<M:knd(5)  =  LCJs,TO/:L5/M_|s} 

r  h  S  T  E  d  :  H^d'\  ■;  ■  r;ff  hmiS-vm' 

T  I-  let  d  in  to  :  5  let  d'  in  to' 

r  I-  to  :  5  to'  xti  is  new  if  =  (av :  5/7Tt  (xi)) 

T  h  (xi=m)  :  (xi\  S)  ^  (xi  =  m');(xji  =irv(xi));  H 

T  E  r  H  is  new  H  —  (tj/  —t) 

T  E  (tj  =  r)  :  (fi  =  r)  (f,=  |rJs)i  (*»'  =  *«);  H 

T  E  e  :  r  Vii  is  new  H  —  ( Vii  :  r) 

T  E  (ry  =  e)  :  (r^r)^  (ry=  |e_|s);  («,'=«»);  if 

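F.9  mexp translation with coercion: $\Gamma \vdash S \downarrow S' : p \leadsto m'$

Given two instantiated EMC signatures $S$ and $S'$ such that $\Gamma \vdash S \le S'$, the coercion transformation $\Gamma \vdash S \downarrow S' : p \leadsto m'$ turns the KMC path $p$ with type $\lfloor S\rfloor_s$ into a KMC expression $m'$ with type $\lfloor S'\rfloor_s$. Note that the deduction rules given below only work when $S$ and $S'$ are instantiated signatures.

To simplify the presentation, we use $X$ to represent an ordered list of EMC specifications; this can either be an empty list ($\varepsilon$) or a specification followed by another list ($H, X$). Similarly, we use $ds$ to represent an ordered list of KMC module fields.

\[ \frac{\Gamma \vdash S \downarrow S' : p.x \leadsto m}{\Gamma \vdash (x_i : S) \downarrow (x_i : S') : p \leadsto (x_i = m)} \]
\[ \Gamma \vdash (t_i = \tau) \downarrow (t_i = \tau') : p \leadsto (t_i = p.t) \]
\[ \Gamma \vdash (v_i : \tau) \downarrow (v_i : \tau') : p \leadsto (v_i = p.v) \]
\[ \frac{\Gamma \vdash X \downarrow X' : p \leadsto ds}{\Gamma \vdash \mathtt{sig}\ X\ \mathtt{end} \downarrow \mathtt{sig}\ X'\ \mathtt{end} : p \leadsto \{ds\}} \]
\[ \Gamma \vdash \varepsilon \downarrow \varepsilon : p \leadsto \varepsilon \]
\[ \frac{\Gamma \vdash H_1 \downarrow H'_1 : p \leadsto d_1 \qquad \Gamma; H_1 \vdash X \downarrow X' : p \leadsto ds}{\Gamma \vdash (H_1, X) \downarrow (H'_1, X') : p \leadsto d_1\ ds} \]
\[ \frac{\Gamma; H \vdash X \downarrow X' : p \leadsto ds}{\Gamma \vdash (H, X) \downarrow X' : p \leadsto ds} \]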
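The list rules above say that each target specification is matched against a source specification with the same name, and that source specifications not mentioned in the target are skipped. A minimal sketch of that thinning process follows; the tuple encoding is hypothetical, paths are plain dotted strings, functor specifications are not covered, and matching is done greedily by kind and name (which assumes each name is specified at most once).

```python
# Hypothetical encoding of specifications:
#   ("typ", name)  ("val", name)  ("mod", name, [specs of the substructure])

def coerce_spec(have, want, path):
    """Build one field for `want` by projecting from `path`."""
    kind, name = want[0], want[1]
    if kind == "mod":
        # Substructures are coerced recursively against the extended path.
        return (name, ("str", coerce_sig(have[2], want[2], f"{path}.{name}")))
    # Type and value fields are plain projections p.t or p.v.
    return (name, f"{path}.{name}")

def coerce_sig(src, dst, path):
    """Return the field list of a structure matching dst, built from path."""
    fields = []
    i = 0
    for want in dst:
        # Skip source specifications that the target does not mention.
        while i < len(src) and src[i][:2] != want[:2]:
            i += 1
        if i == len(src):
            raise ValueError(f"missing specification {want[1]}")
        fields.append(coerce_spec(src[i], want, path))
        i += 1
    return fields
```

For instance, coercing a structure with fields `t`, `helper`, `sub` and `f` to a signature that only mentions `t`, part of `sub`, and `f` yields projections for exactly those three fields, in order.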

S'l'  =  S[  jWi  r;  Xi :  S[  E  S'i  }  knd(5i )  =>  Ci 
r;xi:5{  I"  Si'  X  {Si /Ci)  :  nv{xi)  ^  mi 
m'i  -  (@(p[LCiJs])toi) 

S'i  =  S2/xF  r;  Xi-.S'i,  x^  :  S2  E  S'i  }knd(5^)  =►  C2 
r;  Xi'.S'i]  Xi, : S2  E  S'i  }  (S2/C2)  :  7rv(a;i0  m2 
m'2  —  (u'  ■.’knd(S'2)-  [c2\s,m2\  [S2/u'\s) 

p  —  {7rt(a!i)  1-^  U,  7Tv(Xi)  (->•  Xi] 

to  =  A-u:  knd(5{).Axj :  |_5{'JS  .p(let  av  —  m'i  in  m'2) 

T  E  fsig(a;i :  5i)  :>  52  4- f  sig(a;i :  :>  52  :  p  to 


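G  Static Semantics for FTC

This appendix gives the complete typing rules for the $F_\omega$-based target calculus FTC.

G.1  context formation: $\vdash \Gamma\ \mathrm{ok}$

\[ \vdash \varepsilon\ \mathrm{ok} \tag{122} \]
\[ \frac{\Gamma \vdash M \qquad x \notin \mathrm{dom}(\Gamma)}{\vdash \Gamma; x : M\ \mathrm{ok}} \tag{123} \]
\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad u \notin \mathrm{dom}(\Gamma)}{\vdash \Gamma; u : K\ \mathrm{ok}} \tag{124} \]

G.2  constructor formation: $\Gamma \vdash C : K$

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad u : K \in \Gamma}{\Gamma \vdash u : K} \tag{125} \]
\[ \frac{\Gamma \vdash p : \exists u : K . M}{\Gamma \vdash \pi_t(p) : K} \tag{126} \]
\[ \frac{\Gamma \vdash C_i : K_i \qquad i = 1, \ldots, n}{\Gamma \vdash \{l_1 = C_1, \ldots, l_n = C_n\} : \{l_1 : K_1, \ldots, l_n : K_n\}} \tag{127} \]
\[ \frac{\Gamma \vdash C : K' \qquad K' = \{\ldots, l : K, \ldots\}}{\Gamma \vdash \#l(C) : K} \tag{128} \]
\[ \frac{\Gamma; u : K \vdash C : K'}{\Gamma \vdash \lambda u : K . C : K \to K'} \tag{129} \]
\[ \frac{\Gamma \vdash C_1 : K \to K' \qquad \Gamma \vdash C_2 : K}{\Gamma \vdash C_1[C_2] : K'} \tag{130} \]

G.3  type formation: $\Gamma \vdash M$

\[ \frac{\Gamma \vdash C : \Omega}{\Gamma \vdash T(C)} \tag{131} \]
\[ \frac{\Gamma \vdash M_i \qquad i = 1, \ldots, n}{\Gamma \vdash \{l_1 : M_1, \ldots, l_n : M_n\}} \tag{132} \]
\[ \frac{\Gamma; x : M \vdash M'}{\Gamma \vdash \Pi x : M . M'} \tag{133} \]
\[ \frac{\Gamma; u : K \vdash M}{\Gamma \vdash \forall u : K . M} \tag{134} \]
\[ \frac{\Gamma; u : K \vdash M}{\Gamma \vdash \exists u : K . M} \tag{135} \]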
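The constructor formation rules of G.2 are syntax-directed, so they can be read as a small kind checker. The sketch below uses a hypothetical tuple encoding, represents record kinds as dictionaries and arrow kinds as `("->", K, K')`, and leaves out the $\pi_t(p)$ case because it depends on the typing of module paths. The context is a plain dictionary, so the freshness side condition of context formation is not enforced.

```python
# Hypothetical encoding of constructors:
#   ("var", u)  ("rec", {label: C})  ("proj", label, C)
#   ("lam", u, K, C)  ("app", C1, C2)

def kind_of_con(ctx, c):
    """Return the kind of constructor c under context ctx (a dict u -> K)."""
    tag = c[0]
    if tag == "var":
        if c[1] not in ctx:
            raise TypeError(f"unbound constructor variable {c[1]}")
        return ctx[c[1]]
    if tag == "rec":
        # Each field is kinded independently.
        return {label: kind_of_con(ctx, ci) for label, ci in c[1].items()}
    if tag == "proj":
        k = kind_of_con(ctx, c[2])
        if not isinstance(k, dict) or c[1] not in k:
            raise TypeError(f"no field {c[1]} to project")
        return k[c[1]]
    if tag == "lam":
        _, u, k, body = c
        return ("->", k, kind_of_con({**ctx, u: k}, body))
    if tag == "app":
        kf = kind_of_con(ctx, c[1])
        ka = kind_of_con(ctx, c[2])
        if not (isinstance(kf, tuple) and kf[0] == "->" and kf[1] == ka):
            raise TypeError("ill-kinded application")
        return kf[2]
    raise ValueError(f"unknown constructor: {c!r}")
```

Kinds are compared with structural equality here, which is adequate because kinds contain no variables.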
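G.4  exp formation: $\Gamma \vdash m : M$

\[ \frac{\vdash \Gamma\ \mathrm{ok} \qquad x : M \in \Gamma}{\Gamma \vdash x : M} \tag{136} \]
\[ \frac{\Gamma \vdash p : \{\ldots, l : M, \ldots\}}{\Gamma \vdash p.l : M} \tag{137} \]
\[ \frac{\Gamma \vdash p : \exists u : K . M \qquad \rho = \{u \mapsto \pi_t(p)\}}{\Gamma \vdash \pi_v(p) : \rho(M)} \tag{138} \]
\[ \frac{\Gamma \vdash m_k : M_k \qquad k = 1, \ldots, n}{\Gamma \vdash \{l_1 = m_1, \ldots, l_n = m_n\} : \{l_1 : M_1, \ldots, l_n : M_n\}} \tag{139} \]
\[ \frac{\Gamma; x : M \vdash m : M'}{\Gamma \vdash \lambda x : M . m : \Pi x : M . M'} \tag{140} \]
\[ \frac{\Gamma \vdash m : \Pi x : M . M' \qquad \Gamma \vdash p : M}{\Gamma \vdash m(p) : \{x \mapsto p\}(M')} \tag{141} \]
\[ \frac{\Gamma; u : K \vdash m : M}{\Gamma \vdash \Lambda u : K . m : \forall u : K . M} \tag{142} \]
\[ \frac{\Gamma \vdash m : \forall u : K . M \qquad \Gamma \vdash C : K}{\Gamma \vdash m[C] : \{u \mapsto C\}(M)} \tag{143} \]
\[ \frac{\Gamma \vdash C : K \qquad \Gamma \vdash m : \{u \mapsto C\}(M)}{\Gamma \vdash \langle u : K = C,\ m : M \rangle : \exists u : K . M} \tag{144} \]
\[ \frac{\Gamma \vdash m_1 : M_1 \qquad \Gamma; x : M_1 \vdash m_2 : M_2 \qquad \Gamma \vdash M_2}{\Gamma \vdash \mathtt{let}\ x = m_1\ \mathtt{in}\ m_2 : M_2} \tag{145} \]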


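G.5  constructor equivalence: $\Gamma \vdash C \equiv C' : K$

Rules for congruence, reflexivity, symmetry, and transitivity are omitted.

\[ \frac{\rho = \{u \mapsto C'\} \qquad \Gamma \vdash C' : K \qquad \Gamma; u : K \vdash C : K'}{\Gamma \vdash (\lambda u : K . C)[C'] \equiv \rho(C) : K'} \tag{146} \]
\[ \frac{\Gamma \vdash C : K \to K'}{\Gamma \vdash (\lambda u : K . C[u]) \equiv C : K \to K'} \tag{147} \]
\[ \frac{\Gamma \vdash C \equiv \{\ldots, l = C', \ldots\} : \{\ldots, l : K', \ldots\}}{\Gamma \vdash \#l(C) \equiv C' : K'} \tag{148} \]
\[ \frac{K = \{l_1 : K_1, \ldots, l_n : K_n\} \qquad \Gamma \vdash C_i \equiv \#l_i(C) : K_i \qquad i = 1, \ldots, n}{\Gamma \vdash \{l_1 = C_1, \ldots, l_n = C_n\} \equiv C : K} \tag{149} \]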
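The beta rule for constructor application and the projection rule for records can be oriented left to right and used as reductions. The sketch below is one possible reading under a hypothetical tuple encoding; it does not implement the two extensionality rules, and its substitution is not capture-avoiding, so it assumes that bound variable names are distinct from the free variables of any argument.

```python
# Hypothetical encoding of constructors:
#   ("var", u)  ("rec", {label: C})  ("proj", label, C)
#   ("lam", u, K, C)  ("app", C1, C2)

def subst(c, u, arg):
    """Replace free occurrences of variable u in c by arg (no renaming)."""
    tag = c[0]
    if tag == "var":
        return arg if c[1] == u else c
    if tag == "rec":
        return ("rec", {l: subst(ci, u, arg) for l, ci in c[1].items()})
    if tag == "proj":
        return ("proj", c[1], subst(c[2], u, arg))
    if tag == "lam":
        if c[1] == u:
            return c  # u is shadowed by the binder
        return ("lam", c[1], c[2], subst(c[3], u, arg))
    if tag == "app":
        return ("app", subst(c[1], u, arg), subst(c[2], u, arg))
    raise ValueError(f"unknown constructor: {c!r}")

def normalize(c):
    """Reduce beta redexes and record projections everywhere in c."""
    tag = c[0]
    if tag == "var":
        return c
    if tag == "rec":
        return ("rec", {l: normalize(ci) for l, ci in c[1].items()})
    if tag == "proj":
        inner = normalize(c[2])
        if inner[0] == "rec" and c[1] in inner[1]:
            return inner[1][c[1]]  # #l({..., l = C', ...}) reduces to C'
        return ("proj", c[1], inner)
    if tag == "lam":
        return ("lam", c[1], c[2], normalize(c[3]))
    if tag == "app":
        f, a = normalize(c[1]), normalize(c[2])
        if f[0] == "lam":
            return normalize(subst(f[3], f[1], a))  # beta step
        return ("app", f, a)
    raise ValueError(f"unknown constructor: {c!r}")
```

Two well-kinded constructors that normalize to the same term are equivalent under these two rules; the converse would additionally need the extensionality rules.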


G.6  type equivalence: $\Gamma \vdash M \equiv M'$

Rules  for  congruence,  reflexivity,  symmetry,  and  transitivity  are 
omitted. 